DIGITAL AIRCRAFT MULTISPECTRAL SCANNER
DATA TAPE FORMAT


F. Y. Borden


The format described here was designed for any
multispectral scanner data collected from an airborne
platform. All programs written for the Office for Remote
Sensing of Earth Resources (ORSER) at The Pennsylvania State
University for processing this kind of data will accept this
format. Digital tapes with other formats, such as the
"Aircraft Data Storage Tape Format" of LARSYS Version 2, can
be reformatted to agree with these specifications without
serious difficulty.

Data Organization

Data on the tape are organized according to flightlines,
with each flightline of data comprising a unique file. The
full set or a subset of the full set of data for a
flightline consists of a group of physical and logical tape
records. One or more flightline data sets can be stored on
one tape and an incomplete set can be continued from one
tape to another.

Each file of flightline data is composed of five kinds
of records, as follows and in the order given:

1. File identification - 1 or more records, each 250
   words in length.

2. Table of contents - 1 record of 400 words in
   length.

3. Multispectral scanner response records - 1 or more
   records per scan line containing all channel
   responses for all scan line elements.

4. History record sets - 1 set of records for each
   main program execution which caused the file to
   be modified.

5. End of file record - 1 record 250 words in length.

File Identification Record

The file identification record contains the information
relating to the data set for the flightline. It should
agree with the external documentary information for the
flightline. The record is fixed in length at 250 words.


Word      Format    Contents

1-3       Alpha     Flightline or user ID (12 characters)
4         Integer   Continuation code
                      0 = No ID records following first one
                      n = n ID records following first one
5         "         Number of data channels
6         "         Original number of elements per scan line
7-10      Alpha     ORSER external tape label (16 characters)
11        Integer   Month data were collected
12        "         Day data were collected
13        "         Year data were collected
14        "         Time of day data were collected
15        "         Altitude above ground of aircraft
16        "         Ground heading of aircraft
17-19     Alpha     Date this specific data set was
                    prepared (12 characters)
20        Integer   Air speed (mph)
21        Alpha     Type of original tape: ERTS, C130, U2, LARS
22-25     Alpha     Platform description
26-30     Alpha     Scanner description
31        Integer   Milliradians per element
                      1 = present
                      0 = absent
33-37     Alpha     Name of user who created this data set
38-41     Alpha     ORSER external label of subset
                    source tape
42        Integer   Subset source tape file number
43        Integer   File number of this tape
44-50               Unused
51        Integer   Number of first spectral band
                    (channel) on file
52        Real      Lower limits in micrometers of first
                    spectral band in file
53        "         Upper limits in micrometers of first
                    spectral band in file
54        "         0 or suggested value of C0 calibration
                    pulse
55        Real      0 or suggested value of C1 calibration
                    pulse
56        "         0 or suggested value of C2 calibration
                    pulse
57-199    "         Repetition of description for words 51-56
                    applied to other channels in file, in
                    order of appearance in data
200-250   Alpha     ERTS ID record if this tape was
                    generated from a NASA-ERTS tape

If the tape has been generated from C130, U2, or LARS
aircraft data, a second ID record will be present that will
contain the original ID record from the original tape.
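
As an illustration of how the fixed word positions above might be read, the following is a minimal Python sketch, not part of the ORSER software. It decodes only words 1-13. The 4-byte word follows from 12 characters occupying words 1-3; the big-endian signed integers and the EBCDIC (`cp037`) character coding are assumptions about the host machine, and the function and key names are hypothetical.

```python
import struct

WORD_BYTES = 4            # words 1-3 hold 12 characters, so 4 bytes per word
ID_RECORD_WORDS = 250     # fixed record length given in the text


def alpha(record, first_word, last_word, encoding):
    """Decode words first_word..last_word (1-based, inclusive) as text."""
    raw = record[(first_word - 1) * WORD_BYTES:last_word * WORD_BYTES]
    return raw.decode(encoding).strip()


def integer(record, word):
    """Read one word (1-based) as a big-endian signed integer (assumed)."""
    return struct.unpack_from(">i", record, (word - 1) * WORD_BYTES)[0]


def parse_id_record(record, encoding="cp037"):
    """Return the first few fields of a file identification record."""
    if len(record) != ID_RECORD_WORDS * WORD_BYTES:
        raise ValueError("ID record must be 250 words long")
    return {
        "flightline_id": alpha(record, 1, 3, encoding),
        "continuation_code": integer(record, 4),
        "n_channels": integer(record, 5),
        "elements_per_line": integer(record, 6),
        "tape_label": alpha(record, 7, 10, encoding),
        "month": integer(record, 11),
        "day": integer(record, 12),
        "year": integer(record, 13),
    }
```

The remaining words could be added to the dictionary in the same way once the byte order and character coding of a particular tape have been confirmed.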

Table of Contents Record

The table of contents record contains the list of all
data blocks in the file. A data block is defined as all
data from a beginning scan line through an ending scan line
including all elements in each scan line beginning with a
given element number and ending with a given element number.
As many as 50 different blocks can exist in the file. The
table of contents is a 400-word fixed-length record composed
of 50 eight-word sets. Each non-zero set applies to one of
the blocks in the file. The specifications for the first
set are as follows:

Word      Format    Contents

1         Integer   Beginning scan line number for the block
51        "         Ending scan line number for the block
101       "         Beginning element number for each line
                    in the block
151       "         Ending element number for each line in
                    the block
201       "         Increment for scan line numbers in the
                    block; i.e., an increment of 1 means
                    every line is present, whereas an
                    increment of 3 means every third line
                    is present
251       "         Increment for element numbers in all scan
                    lines of the block
301       "         Number of scan lines in the block
351       "         Number of elements in a scan line
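
The word positions for the first set (1, 51, 101, ..., 351) suggest that the eight values of a set are spaced 50 words apart. The sketch below reads the record on that basis. It assumes that set s (s = 1 to 50) occupies words s, 50 + s, ..., 350 + s, and that words are 4-byte big-endian integers; neither assumption is stated in the text, and the names used are hypothetical.

```python
import struct
from typing import NamedTuple

TOC_WORDS = 400
MAX_BLOCKS = 50


class Block(NamedTuple):
    first_line: int
    last_line: int
    first_element: int
    last_element: int
    line_increment: int
    element_increment: int
    n_lines: int
    n_elements: int


def parse_table_of_contents(record):
    """Return the non-zero eight-word sets of a table of contents record."""
    if len(record) != 4 * TOC_WORDS:
        raise ValueError("table of contents must be 400 words long")
    words = struct.unpack(">400i", record)
    blocks = []
    for s in range(MAX_BLOCKS):
        # Assumed layout: field k of set s sits at word k * 50 + s (0-based).
        fields = [words[k * MAX_BLOCKS + s] for k in range(8)]
        if any(fields):          # all-zero sets are unused
            blocks.append(Block(*fields))
    return blocks
```

Under these assumptions, a block that begins at line 100, ends at line 400, and has a line increment of 3 would be expected to carry 101 in its seventh field.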


Multispectral Scanner Response Records

One or more records exist for each scan line of data
and include all selected elements and all channels for that
scan line. If the number of elements per scan line is 222
or less, and the number of channels is 13 or less, one
record contains all data for the scan line. Otherwise, the
additional data are contained, in the same format, on
continuation records.


Byte        Contents

1-2         Scan line number
3-4         Roll parameter
5-6         Beginning element number in the line
7-8         Ending element number in the line
9-10        Element number increment
11-12       Continuation code
13 to n*    Responses ordered as follows:
              first channel, first element
              first channel, second element
              ...
              first channel, last element
              calibration data for first channel (8 bytes)
              second channel, first element
              second channel, second element
              ...
              last channel, first element
              last channel, second element
              ...
              last channel, last element
n-7 to n    Calibration data for last channel (8 bytes)

* n = m (e + 8) + 12 for m number of channels and
  e number of elements for the line, n < 2976.
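
The byte layout above can be sketched in Python as follows. This is an illustration only: it assumes big-endian 16-bit header fields, one unsigned byte per response (which is what the record length formula implies), and that the element count follows from the beginning element, ending element, and increment in the header. Continuation records are not handled, and the function names are hypothetical.

```python
import struct

HEADER_BYTES = 12
CALIBRATION_BYTES = 8


def record_length(n_channels, n_elements):
    """n = m (e + 8) + 12, the record length formula given with the table."""
    return n_channels * (n_elements + CALIBRATION_BYTES) + HEADER_BYTES


def parse_response_record(record, n_channels):
    """Split one response record into its header fields and channel data."""
    line, roll, first, last, step, cont = struct.unpack(
        ">6h", record[:HEADER_BYTES])
    n_elements = (last - first) // step + 1      # assumed element count
    if len(record) < record_length(n_channels, n_elements):
        raise ValueError("record shorter than m (e + 8) + 12 bytes")
    channels = []
    pos = HEADER_BYTES
    for _ in range(n_channels):
        responses = list(record[pos:pos + n_elements])
        pos += n_elements
        calibration = record[pos:pos + CALIBRATION_BYTES]
        pos += CALIBRATION_BYTES
        channels.append((responses, calibration))
    return {
        "scan_line": line,
        "roll": roll,
        "first_element": first,
        "last_element": last,
        "element_increment": step,
        "continuation_code": cont,
        "channels": channels,
    }
```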

History Records

Each step in the data processing that generates a
modified file will cause a set of history records to be
added to the modified file. This set will be added to any
prior history records.

History Header Record. The first record of the set is
an eight-word fixed-length record. The format of the header
record is as follows:

Word      Format    Contents

1-4       Alpha     Name of program that made data
                    set modification (16 characters)
5-6       "         Run date
7         Integer   Run identification
8         "         Number of records following this one
                    for this set (n)



History Data Records. Following each header record,
n records with an 80A1 format follow, where n is the value
in word eight of the history header record. These records
are typically the control card images for the run.

End of File Record

The end of file record is fixed in length at 250 words
and is a reproduction of the identification record of the
file.


Data Subsets

For economy in computer use, it is desirable to
minimize tape processing time. For this reason it is
anticipated that a user will construct one or more subsets
of data to delete sections that are of no interest to him.
Data to be used frequently in processing would be selected
in order to omit the long tape search time required if the
data had to be selected from the full set each time. To do
this the user selects blocks of data he is interested in and
prepares a tape containing only those blocks. All subset
data tapes have this same format and all programs have been
designed to accept this format whether from an original tape
or from any subset thereof.

The table of contents, as well as the history records,
is valuable for this reason from the point of view of
control of the data for the user. Use of the program to
construct data subsets is described elsewhere, but the
programs can be used to construct smaller subsets of data
from a subset data tape. Other programs, for example
sampling programs, also can be expected to modify data sets.
The necessity and value of the table of contents and the
history records are therefore clear.


TPINFO PROGRAM DESCRIPTION


H. M. Lachowski


The primary purpose of TPINFO is to output information
for an original or any SUBSET data tape containing digitized
aircraft multispectral scanner data. The information of
interest to the user is contained in the Identification (ID)
record and in the Table of Contents records at the beginning
of each tape. Detailed descriptions of these records may be
found in the Digital Aircraft Multispectral Scanner Data
Format documentation.

By using control cards, the user may request the
following output from the TPINFO program:

1. ID record,

2. Table of Contents record,

3. Response record (the first record following the
   Table of Contents), and

4. HISTORY records.

The TPINFO program is intended mainly for the user who
is not sure about the flightline name, certain parameters
such as the number of observations per scan line, the number
of data channels, or what blocks of data are contained on a
given tape file. In most cases, the ID and the Table of
Contents records will be sufficient.


SUBSET PROGRAM DESCRIPTION


F. Y. Borden and H. M. Lachowski


The SUBSET program is used primarily to increase tape
processing efficiency and to reduce computation cost. A
flightline generally results in a large digital file
frequently consisting of more than one full tape reel. A
user is usually interested in only relatively small parts of
a flightline. The SUBSET program allows the user to specify
the parts of a flightline he is interested in and to
construct onto his tape a subset of the data that contains
only the data he specifies. Once the subset data tape has
been constructed, subsequent processing using this tape
avoids the costly bypassing of unwanted data as would be the
case if the original tape were to be used. The SUBSET
program can also be used to select a smaller subset from a
tape that has been constructed as a subset of the original
or other prior subset of data.

A typical example of the use of SUBSET follows.
Suppose a user desires to study certain soil areas that he
can identify in a general way from aerial photography
associated with a flightline. SUBSET can be used to
construct a subset of the flightline data containing only
the soil areas to be investigated. Once this tape has been
prepared, the user may want to select a number of smaller
areas that contain the training areas to be used to develop
classification parameters for statistical classification
procedures. This could be done by constructing another
subset of the data using SUBSET with the first subset data
tape used as input. When the classification parameters have
been estimated to the user's satisfaction using the second
subset of the data, the classification could be run on the
first subset. In this way a minimum amount of unused
data would have to be passed in each computer run and a
substantial cost savings would result compared to using the
original full flightline of data for each run.



Every subset of data output by SUBSET has the same tape
format as the original data tape. Every subset tape can be
processed by any of the programs that can operate on the
original data tape. The detailed description of the tape
format is given in the "Digital Aircraft Multispectral
Scanner Data Tape Format" manual. The identification record
of the source tape is reproduced on the subset tape with
only the name of the tape changed to the name specified by
the user. The table of contents record for the subset tape
is constructed from the input specifications of the user and
the table of contents from the input tape. The table of
contents specifies exactly the contents of the subset tape.
The data on every subset tape are always in the order of
increasing scan line number. There are no duplications of
line numbers and no out-of-order scan lines. The section
describing block restructuring presents more details
regarding the record organization for the scan lines. The
history records are reproduced onto the subset tape and
augmented appropriately for the run. Any inputting
subroutines that work for the original tape, such as GETLIN,
will function properly for any subset data tape.


Specifying a Subset

A subset of data is defined by specifying one or more
blocks of data that will be selected from the source set to
be put on the subset tape. A block is composed of data
beginning with a designated scan line and ending with a
designated line and including elements in each line from a
beginning element number through an ending element number.
This is comparable to a rectangular area along the
flightline with two sides parallel to the flightline.

Not all lines are necessarily included in a block; a line
sampling increment can be used to prescribe the spacing
between lines to be selected. In the same way, not all
elements within the limits need to be selected since an
element sampling increment can be prescribed. In specifying
blocks, the source data tape table of contents must be
consulted to ensure that the source from which the subset is
chosen contains the desired lines and elements within lines.
A number of inconsistencies between the requested block
delimiters and the actual data available on the source
tape are allowed and adjustments are automatically made
in the block requests. However, to ensure a completely
satisfactory subset, the data available, as given in the
table of contents of the source tape, should be used as a
reference in specifying blocks for the subset. The rules
governing the adjustments for inconsistencies are covered in
a later section.

It is expected that blocks in the requested subset will
frequently overlap. Two types of overlap can occur. First,
blocks can spatially overlap in that one may be wholly or
partly included in one or more others. Second, blocks may
overlap by having some or all scan lines in common, but not
be spatially overlapping. Since the subset tape will
contain no duplicate line numbers and will contain all
elements in a scan line within only two delimiters, the
requested blocks in overlapping cases will be restructured
internally. The exact nature of the restructuring is
described in a later section. However, the SUBSET program
will cause all of the required data to be present in the
subset if the requests are consistent with the available
set. For each inconsistency, a reasonable substitution will
be made if possible; otherwise, the requested block will be
overlooked. Output messages fully cover the unusual
situations. In addition, in every run the requested table
of contents and an actual table of contents for the output
subset are printed. SUBSET, in all cases, will generate the
minimum subset of data that includes all of the requested
data (considering substitutions or deletions as above)
within the constraints of the data tape format and
organization.


Block Restructuring

Whenever requested blocks of data overlap, the
overlapping blocks are restructured automatically into
non-overlapping blocks. None of the requested data will be
lost in the process but, frequently, additional data will be
included as a result of the restructuring. An additional
kind of overlapping brings about restructuring; i.e., a
requested block that does not fall completely within a block
on the input tape is partitioned into two parts, the first
of which does fall completely within one of the input tape
blocks. The second partition is put back in the table of
requests and is considered as a separate block later when
its turn comes in the run.

Restructuring is necessary where overlapping occurs to
avoid duplication of scan line numbers in the subset tape
which, in turn, circumvents complicated programming and
processing for subset tapes. On the other hand, automatic
restructuring releases the user from undue restrictions or
complications in specifying the blocks he wants.

In the restructuring, overlapping blocks are
partitioned and recombined. Where recombination takes
place, the smallest line increment and element increment of
the partitions of the original requested blocks apply. The
table of contents is changed to agree exactly with the
restructuring.

The rules for restructuring two overlapping blocks,
block i and block i-1, are given below and illustrated in
Figures 1 through 5. The overlapping condition is given in
the diagrams on the left side of the arrow and the result of
the restructuring is given on the right. The smallest scan
line for a block is at the bottom of the rectangle. The
tests for the various conditions are applied repetitiously
for each block i and restructuring is made when necessary
until, finally, no overlapping remains.

If one block is completely included in another and the
element increments are the same, the included block is
deleted as a separate entity (Figure 1). If the element
increments are not the same, the blocks are reconstructed.

For two blocks in which the scan lines overlap and do
not begin on the same line, the block with the lowest
numbered scan line is partitioned. The first part ends one
line before the beginning line of the overlapping block and
the second part covers the remainder, beginning at the same
line as for the overlapping block (see Figures 2 and 3).

For a second block that is a continuation of a first
block, the two are combined when the line and element
increments match (Figure 4).

For two blocks in which the scan lines overlap and have
the same beginning scan line, the blocks are restructured so
that one includes all of one of the original blocks and part
of the other and the other restructured block contains what
is left over (Figure 5). In Figure 5 the data within the
dashed lines is included although it was not requested.
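
The partitioning rule for two blocks whose scan lines overlap but begin on different lines (Figures 2 and 3) can be sketched as follows. This is only an illustration of that one rule, not the SUBSET code: it assumes a line increment of 1 in both blocks, and the `Block` type and function name are hypothetical.

```python
from typing import NamedTuple


class Block(NamedTuple):
    first_line: int
    last_line: int
    first_element: int
    last_element: int


def split_lower_block(a, b):
    """Partition the block that starts on the lower scan line.

    Returns (first_part, second_part, other_block). The first part ends
    one line before the other block begins; the second part begins on
    the same line as the other block.
    """
    low, high = sorted((a, b), key=lambda blk: blk.first_line)
    if low.first_line == high.first_line:
        raise ValueError("blocks begin on the same line; a different rule applies")
    if low.last_line < high.first_line:
        raise ValueError("scan lines do not overlap")
    first_part = low._replace(last_line=high.first_line - 1)
    second_part = low._replace(first_line=high.first_line)
    return first_part, second_part, high
```

After this step the second part and the other block begin on the same scan line, so the rule for blocks with the same beginning scan line would apply to them next.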

[Figures 2 through 5: diagrams of the restructuring of
overlapping blocks i and i-1]


NMAP PROGRAM DESCRIPTION


F. Y. Borden


The primary purpose for which the NMAP program was
designed is to assist the user in recognizing and visually
correlating blocks of multispectral scanner remote sensor
data on tape with areas seen on photographic imagery of the
same flightline or scene. The user specifies the blocks on
the tape to be mapped, the map symbols and class limits for
classification, and the spectral bands (channels) to be
used. The principal output consists of a map for the
blocks specified according to the map symbols and class
specifications used, with numbered scan lines and elements.


Computational Methods

The method is based on the norm of each observation in
the data. An observation consists of the set of values for
all channels for a single element in a scan line. For p
channels, the observation for element j in scan line i can
be represented as a p-valued vector, $X_{ij}$.
The norm of the observation, $\|X_{ij}\|$, is then

$$\|X_{ij}\| = (X_{ij}' X_{ij})^{1/2}$$

Geometrically the norm is simply the length of the vector,
$X_{ij}$, in p-dimensional space.

The norm of each observation is computed and 
transformed into the percentage of the maximum possible 
value for the norm. It is then translated into a mapping 
symbol of the class for which it falls within the class 
percentage limits. 

The maximum possible norm value depends on the number
of grey scale levels for each channel. For p channels and
$n_1, n_2, \ldots, n_p$ grey scale levels in each of the
p channels, the maximum possible norm value would be

$$\left[ \sum_{j=1}^{p} (n_j - 1)^2 \right]^{1/2}$$

In most cases the number of levels is 64, 128, or 256.
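
The computation described above can be illustrated with a short Python sketch. It assumes that responses in a channel with n grey scale levels run from 0 to n - 1, which is what the maximum norm expression implies; the function names are hypothetical and this is not the NMAP code itself.

```python
import math


def max_norm(levels):
    """Largest possible norm for channels with the given grey scale levels."""
    return math.sqrt(sum((n - 1) ** 2 for n in levels))


def norm_percent(observation, levels):
    """Norm of one observation as a percentage of the maximum possible norm."""
    norm = math.sqrt(sum(x * x for x in observation))
    return 100.0 * norm / max_norm(levels)
```

For two channels of 256 levels each, an observation of (255, 255) gives 100 percent and an observation of (0, 0) gives 0 percent; the percentage would then be compared with the class percentage limits to select a mapping symbol.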


Use of the Program


NMAP uses original or subset tapes in the format
defined in the manual, "Digital Aircraft Multispectral Data
Tape Format." Control cards are used to specify the
flightline or scene (tape file name) which is to be used, to
specify the blocks to be mapped according to scan line and
element designations, and to specify mapping symbols and
limiting percentages for classes to be used. Default
options which are appropriate in most situations minimize
control card preparation. Output consists of a title page
with the control specifications for the run, map pages for
each block requested, a summary of the classification
results for each block, and a table of the frequency
distribution by one percentiles for each block.


UMAP PROGRAM DESCRIPTION


F. Y. Borden


UMAP is useful for identifying areas of uniformity and
non-uniformity in the remote sensor data by mapping. Such
maps are valuable in the identification of suitably uniform
areas for use as training fields for other analytical
programs. During intermediate stages of analyses, the maps
are useful as guidelines in judging the adequacy of maps
from clustering and classification analyses. An alternate
use of the program is for the delineation of high contrasts
and boundaries of contrasting areas.


Computational Methods


The absolute value of the Euclidean distance between
the end-points of two vectors is D. Let $X_1$ be the vector
$(x_{11}, x_{12}, \ldots, x_{1p})'$ and $X_2$ be
$(x_{21}, x_{22}, \ldots, x_{2p})'$. Geometrically these define
vectors with a common beginning point at the origin and $X_1$
and $X_2$ as the end-points in p-dimensional space. In remote
sensor data, each vector is composed of the set of responses
for the spectral bands defined by the multispectral scanner
used to obtain the data. The squared distance between the
two end-points is found as

$$D^2 = (X_1 - X_2)'(X_1 - X_2) = \sum_{i=1}^{p} (x_{1i} - x_{2i})^2$$

If D is small for a pair of vectors, the
vectors are geometrically close together and numerically
similar. A large value of D indicates a large contrast
between the vector pair or a strong dissimilarity.

Contrasts are computed in this way in UMAP and then
translated into mapping symbols and mapped in the output.

The maximum value of a response in a channel depends on
the number of grey scale levels for that channel. For p
channels and $n_1, n_2, \ldots, n_p$ grey scale levels in each
of the p channels, the absolute maximum and minimum values
for D are $\left[ \sum_{i=1}^{p} (n_i - 1)^2 \right]^{1/2}$
and 0, respectively.

Every point is identified by a scan line number and an
element number within the scan line. Four D values are
computed for each point, using as the other member of the
pair for a D one of its near neighbors. Let the subscript i
designate the ith scan line and let j designate the jth
element. The following values are computed:


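$$D_1(i,j) = \frac{1}{d_1} \left[ (X_{i,j} - X_{i,j+1})'(X_{i,j} - X_{i,j+1}) \right]^{1/2}$$

$$D_2(i,j) = \frac{1}{d_2} \left[ (X_{i,j} - X_{i+1,j})'(X_{i,j} - X_{i+1,j}) \right]^{1/2}$$

$$D_3(i,j) = \frac{1}{d_3} \left[ (X_{i,j} - X_{i+1,j+1})'(X_{i,j} - X_{i+1,j+1}) \right]^{1/2}$$

$$D_4(i,j) = \frac{1}{d_3} \left[ (X_{i,j+1} - X_{i+1,j})'(X_{i,j+1} - X_{i+1,j}) \right]^{1/2}$$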


The value $D_{ij}$ is assigned the maximum $D_k(i,j)$,
k = 1, 2, 3, 4. The incremental spatial distance between the
two elements of the pair is taken into consideration by
$d_1$, $d_2$, and $d_3$. The reciprocal of these is the
weighting value in a linear interpolation sense of the
distance between the elements. In the case of every line and
every element processing, $d_1$ is 1, the increment between
two neighboring elements in the same line. Similarly, $d_2$
is 1 for the spatial increment between two elements in
neighboring lines in the same position in each line. The
value of $d_3$ is the hypotenuse value, $\sqrt{2}$, for two
elements each on a line and an element position differing by
one increment.





As detailed in a later section, it is possible to
process, or have available for processing, data that are
other than every line and every element. For data that are
not every line and every element, $d_1$ would be the number
of increments separating two neighboring elements on the
same line, $d_2$ would be the number of increments separating
elements on neighboring lines in the same element position,
and $d_3$ would be $\sqrt{d_1^2 + d_2^2}$.

Values of $D_{ij}$ are converted to symbols prior to
mapping. First each $D_{ij}$ is translated to a 0-100 scale by

$$D'_{ij} = 100 \, (D_{ij} - D_{min}) / (D_{max} - D_{min})$$

The value $D_{max}$ is
$\left[ \sum_{i=1}^{p} (n_i - 1)^2 \right]^{1/2}$ and
$D_{min}$ is 0, as described earlier. Each $D'_{ij}$ is then
assigned to the class within which limits it falls. The
number of classes, the class limits, and the symbol for each
class are under the control of the user. The symbol for the
class within which the $D'_{ij}$ falls is printed in position
i, j of the output map. Class limits and their
specifications are treated in more detail in a later section.





Use of the Program

UMAP has been written to use tapes with the format
described in the manual, "Digital Aircraft Multispectral
Data Tape Format." The input tape may contain a subset of
the original data as processed by the SUBSET program.

Control cards are used to do the following:

1. check the name of the specific input file to
   ensure the correct file will be processed;

2. identify blocks of data to be processed;

3. define classes, class limits, and class
   symbols to be used; and

4. identify selected channels (spectral bands)
   of data to be used.

A control card naming the tape by its internal name is
used if it is desired to check for the correct input tape.
As many as 50 cards, each one specifying a block of data,
may be used to select areas to be processed. Each block is
defined by a beginning scan line, an ending scan line, a
beginning element for all lines, and an ending element for
all lines. The requested block must be contained within one
of the tape file blocks. The specifications of the file
blocks are obtainable from the SUBSET program output Table
of Contents for the run that generated the tape or by use of
the TPINFO program. As an alternative option, all data in
the file may be requested for processing. A line increment
and an element increment may be specified on each block
card. Reference should be made to the section on the
function of line and element increments for a detailed
description of their use.

One of three ways may be used for defining classes. A
control card may be used for each class to be defined for as
many as ten classes. The class control card will contain
the upper limit of the class on a percentage basis, and the
symbol to be mapped for map elements that have a D' value
greater than the limit for the next lower class and less
than or equal to the upper limit for this class. If the
highest class limit is less than 100, each D' that is
greater than the highest specified upper limit will be
printed as a blank.
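
The class assignment rule just described can be sketched in Python as follows. The limits and symbols used in the usage note are illustrative values, not UMAP defaults, and the function names are hypothetical.

```python
from bisect import bisect_left


def scale_to_percent(d, d_max, d_min=0.0):
    """Translate a D value to the 0-100 scale used for class limits."""
    return 100.0 * (d - d_min) / (d_max - d_min)


def class_symbol(d_percent, upper_limits, symbols):
    """Return the symbol of the class whose limits contain d_percent.

    upper_limits must be in increasing order. A value belongs to the
    first class whose upper limit is greater than or equal to it, and
    a value above the highest limit is mapped to a blank.
    """
    index = bisect_left(upper_limits, d_percent)
    return symbols[index] if index < len(symbols) else " "
```

With upper limits of 3, 20, and 50 and the symbols "-", "+", and "*", a D' of 3.0 maps to "-", a D' of 3.1 maps to "+", and a D' of 60 is printed as a blank.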

The second means of defining classes is to input them
on a control card that contains eight symbols, one for each
class. In this case, the class limits are automatically
set. The third means for class definition is by default.
In this case, no control card is used and the following
class definitions apply:

Class Limit    Class Symbol

   3.              U
  20.
 100.




Output from the program consists of a title page with
the general specifications for the run and a set of pages
for each requested block giving the block specifications and
the map of the block as well as a summary of the data. The
summary contains the number of observations in each class,
the overall average D value, and the maximum and minimum D
values found in the data for the block. In addition, a
table of the frequency distribution of D values by one
percentiles is output. Blocks are sorted internally and,
within each block, each line is mapped as it is encountered.


The Function of Line and Element Increments

The line increment and the element increment in a block
specification determine the selection of lines and elements
within lines that are processed within the limits of the
specified block. They may be left unspecified on the
control card, in which case the corresponding increments for
the tape file block will be applied. It is necessary to
spell out their function because their function does exert
influence over the computation of the D values and the
interpretation of the output.

In all cases except for the one described later in this
section, only four elements are entered into the computation
of a $D_{ij}$ value. The four are taken from the corners of
the spatial rectangle having the subscripts (i, j),
(i, j + m), (i + k, j), and (i + k, j + m), where k is the
line increment and m is the element increment. Each of the
four components of $D_{ij}$ uses the appropriate weighting
coefficient with $d_1 = m$, $d_2 = k$, and
$d_3 = \sqrt{m^2 + k^2}$.
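
A hedged Python sketch of the four-corner computation follows. The text fixes the weights $d_1$, $d_2$, and $d_3$ and the four corner elements, but the exact pairing of elements is not fully legible in the source. The sketch assumes three pairs that join element (i, j) to the other corners, plus the opposite diagonal joining (i, j + m) to (i + k, j). The data layout and function names are hypothetical.

```python
import math


def distance(x, y):
    """Euclidean distance between the end-points of two response vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def contrast(data, i, j, k=1, m=1):
    """Weighted maximum contrast D_ij for line increment k, element increment m.

    data[line][element] is a sequence of channel responses.
    """
    a = data[i][j]
    b = data[i][j + m]
    c = data[i + k][j]
    d = data[i + k][j + m]
    d1, d2 = m, k                 # same-line and same-element spacings
    d3 = math.hypot(m, k)         # diagonal spacing, sqrt(m^2 + k^2)
    return max(
        distance(a, b) / d1,
        distance(a, c) / d2,
        distance(a, d) / d3,
        distance(b, c) / d3,      # assumed pairing: the opposite diagonal
    )
```

With k = m = 1 the weights reduce to 1, 1, and the square root of 2, matching the every-line, every-element case described in the computational methods section.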

One special case exists for the UMAP program, that is,
when every line and element is available on tape for a
requested block and where the line and element increments
are specified by the user as 2. In this case, every other
line and every other element in each line are output.
However, instead of only four D values being compared, as
discussed in the section on computational methods, 16 D
values are compared, with the largest one chosen for
$D_{ij}$. The 16 D values arise from four sets of four D
values each. The value for $D_{ij}$ is found by taking the
maximum of the four following maximums: $D_{i,j}$,
$D_{i,j+1}$, $D_{i+1,j}$, and $D_{i+1,j+1}$, where each of
these has been found as the maximum of comparisons
according to the procedure presented in the computational
methods section.

It should be pointed out that this is not the same as
taking the maximum of the D values computed using only the
four corner elements $X_{i,j}$, $X_{i,j+2}$, $X_{i+2,j}$,
and $X_{i+2,j+2}$.


Data Normalization


UMAP has the option for using unaltered data or
normalized data. Normalized data is data that has been
transformed in the following way. Let Z be a vector of
normalized data. The normalization requirement is that
$Z'Z = 1$, which can be accomplished by computing
$Z = X (X'X)^{-1/2}$. This amounts to computing the sum of
squares of the vector elements of X and then dividing each
element by the square root of this value.

For unnormalized data, D as the distance between the
end points of two vectors in p-dimensional space is
influenced by the vector lengths as well as any angular
separation between the vectors. A value of D computed with
normalized data is influenced only by the angular separation
between the two vectors. In normalized data the vectors
have unit length so that for the triangle formed by the
origin and the vector end-points as shown in Figure 1, with
$\theta$ as the angle of separation, $\sin(\theta/2) = d/2$ and
$\theta = 2 \arcsin(d/2)$. In normalized data then, only angular
separation or the difference between relative reflectances
is important.

Figure 1
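
The normalization and the relation between end-point distance and angle can be illustrated with a minimal Python sketch. It is not taken from UMAP; the function names and the sample four-channel responses are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def normalize(x):
    """Scale a response vector to unit length: Z = X (X'X)^(-1/2)."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(x @ x)

def angle_from_distance(d):
    """Angle (radians) between two unit vectors whose end-points are d apart."""
    return 2.0 * np.arcsin(d / 2.0)

# Hypothetical 4-channel responses; x2 is x1 at twice the brightness.
x1 = np.array([10.0, 20.0, 30.0, 40.0])
x2 = 2.0 * x1
x3 = np.array([40.0, 30.0, 20.0, 10.0])

# Unnormalized distance reacts to the brightness difference alone.
d_raw = np.linalg.norm(x1 - x2)
# Normalized distance is zero because the two vectors share a direction.
d_norm = np.linalg.norm(normalize(x1) - normalize(x2))
# For vectors with different directions, d converts directly to an angle.
theta = angle_from_distance(np.linalg.norm(normalize(x1) - normalize(x3)))
```

In this sketch `d_raw` is non-zero while `d_norm` vanishes, which is the sense in which normalized data respond only to angular separation.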

If normalized data is to be used in a classification 
program, the training areas as identified using this program 
should be based on normalized data. If unnormalized data is 
to be used in a classification program, then this program 
should be run using unnormalized data.


STATS PROGRAM DESCRIPTION


H. M. Lachowski and F. Y. Borden


The purpose of the STATS program is to obtain basic
statistical information for remote sensor data target areas
within a flight path. These areas are frequently referred
to as "training areas" and are used to estimate statistical
parameters for specific targets. Training areas of any
polygonal shape can be accommodated by STATS. A training
area may be composed of distinct subareas, in which case the
composite of these subareas is processed as one unit.
An area or a subarea is defined according to the coordinates
of the corners on its perimeter, and all of the data
available within the area are used.

For each area, STATS computes and outputs the vector of
means, the vector of standard deviations, and the
variance-covariance matrix using all of the channels selected
by the user. By option, it computes and outputs the correlation
matrix, the frequency histograms for specified channels, and
the eigenvalues and eigenvectors for the variance-covariance
or correlation matrices. In addition to printed output,
optional output can be obtained in punch-card form, on
magnetic tape, or in disk files.

It is recommended for efficiency in computer use that
this program be used after the user has constructed a subset
tape (described in SUBSET) that contains all of the parts of
a flightline of interest to the user. In this way, the
costly bypassing of unwanted data on the original tape can be
avoided.

Computational Methods


Area Bounds

Each area for which statistics are to be computed is
defined by the pairs of coordinates that designate the
corners of a polygon. The coordinate pairs must be input in
the order of their occurrence in either a clockwise or
counterclockwise direction. The polygon need not be regular
or convex but may be of any desired shape. It may have no
more than forty sides, however. Two or more spatial areas
(subareas) may be combined into one computation area
(composite area), as long as the spatial areas do not
overlap and their combined number of sides does not exceed
forty. In the following discussion a polygon defined by n
pairs of coordinates will be considered. Let $p_i$ be the ith
point defined by the ith pair of coordinates. The
coordinates are integers with the first being the scan line
number and the second the element number in the scan line.
The bounds of an area as defined by $p_i$, $i = 1, 2, \ldots, n$,
are translated to the beginning and ending element bounds for
each scan line that appears in the area. This is done by
computing the element for each scan line that is nearest to
the line segment with end-points $p_i$ and $p_{i+1}$. Each $p_i$,
$i = 1, 2, \ldots, n-1$, is processed in this manner, finishing
with the closing side using $p_n$ and $p_1$. Since any shape
polygon is allowed, it is possible for the boundary line to
cross back and forth across one or more scan lines. In such
cases, each affected scan line would have more than one pair
of beginning and ending element values. As many as ten
pairs of such values are allowed, which means the boundary
line may cross a scan line as many as twenty times. This
limitation is not likely to be important except in very
unusual situations. A long winding path of a stream that
runs in a general direction parallel to the scan lines is one
case where the boundary of the training area may cross some
scan lines a number of times.
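
The translation of polygon corners into per-scan-line element bounds can be sketched as follows. This is an illustrative Python version, not the STATS routine: the function name, the half-open rule used to count a shared corner only once, and the rounding to the nearest element are assumptions. Under that half-open rule the last scan line of the polygon produces no bounds. The limits of forty sides and ten pairs per scan line are taken from the text above.

```python
def scanline_bounds(corners, max_sides=40, max_pairs=10):
    """Translate polygon corners into (begin, end) element pairs per scan line.

    corners: list of (scan_line, element) integer pairs in perimeter order.
    Returns a dict {scan_line: [(begin, end), ...]}.
    """
    n = len(corners)
    if n > max_sides:
        raise ValueError("polygon has more than %d sides" % max_sides)
    lines = [line for line, _ in corners]
    bounds = {}
    for line in range(min(lines), max(lines) + 1):
        crossings = []
        for i in range(n):
            # Side i runs from corner i to corner i+1; the last side closes
            # the polygon by returning to the first corner.
            (l0, e0), (l1, e1) = corners[i], corners[(i + 1) % n]
            if l0 == l1:
                continue  # side parallel to the scan lines
            # Half-open rule: a corner shared by two sides is counted once.
            if min(l0, l1) <= line < max(l0, l1):
                t = (line - l0) / (l1 - l0)
                crossings.append(e0 + t * (e1 - e0))
        crossings.sort()
        # Consecutive crossings pair up as begin/end, rounded to an element.
        pairs = [(round(a), round(b))
                 for a, b in zip(crossings[0::2], crossings[1::2])]
        if len(pairs) > max_pairs:
            raise ValueError("scan line %d has more than %d pairs"
                             % (line, max_pairs))
        if pairs:
            bounds[line] = pairs
    return bounds

# A U-shaped area: full width near the top, two prongs lower down.
u_shape = [(0, 0), (0, 6), (4, 6), (4, 4), (2, 4), (2, 2), (4, 2), (4, 0)]
u_bounds = scanline_bounds(u_shape)
```

For the U-shaped example, a scan line through the prongs yields two begin/end pairs, which is the situation the ten-pair limit is meant to cover.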
Statistics


Let B be the set of all pairs of coordinates in a
bounded area. Consider $X_{ij}$ as the vector for scan
line i and element j composed of a response value for each
of p channels. The mean vector, $\bar{x}$, and the
variance-covariance matrix, C, are based on the $X_{ij}$ for all
(i, j) in B. The matrix C and $\bar{x}$ are computed in a typical
and straightforward way and are therefore not presented here.
If the correlation matrix, R, is requested, it is computed in
place of the C array area using the variance and covariance
values.


The user has the option of having either or both the C
and the R matrices output. Eigenvalues and eigenvectors may
be computed and output based on either the C or the R
matrices. The specifications for the eigenvalue-eigenvector
computations are as follows, using $\Lambda$ as the diagonal
eigenvalue matrix, A as the eigenvector matrix, and the
subscripts c and r as designators for the computations based
on the C or R matrices, respectively:

    $C = A_c \Lambda_c A_c'$ ;  $A_c' A_c = I$

    $R = A_r \Lambda_r A_r'$ ;  $A_r' A_r = I$

Except for a trivial case where C = R, $\Lambda_c \ne \Lambda_r$
and $A_c \ne A_r$.
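
Since the text does not present the computations, the following Python sketch shows one conventional way to obtain the same quantities with NumPy. It is not the STATS code: the function name is hypothetical, and the use of the n - 1 divisor for the variance-covariance matrix is an assumption.

```python
import numpy as np

def training_area_stats(X):
    """Basic statistics for one training area.

    X: (n_observations, p_channels) array of responses for all (i, j) in B.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    C = np.cov(X, rowvar=False)        # variance-covariance, divisor n - 1 (assumed)
    std = np.sqrt(np.diag(C))
    R = C / np.outer(std, std)         # correlation from variances and covariances
    lam_c, A_c = np.linalg.eigh(C)     # C = A_c diag(lam_c) A_c'
    lam_r, A_r = np.linalg.eigh(R)     # R = A_r diag(lam_r) A_r'
    return mean, std, C, R, (lam_c, A_c), (lam_r, A_r)

# Synthetic 3-channel data with unequal channel variances.
rng = np.random.default_rng(0)
sample = rng.normal(size=(50, 3)) * np.array([1.0, 5.0, 10.0])
mean, std, C, R, (lam_c, A_c), (lam_r, A_r) = training_area_stats(sample)
```

Because the channels have unequal variances here, C and R differ, and so do their eigenvalues and eigenvectors.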


Use of the Program

STATS uses original or subset tapes in the Digital
Aircraft Multispectral Data Tape Format. The deck of
control cards consists of an optional flightline name card
followed by one or more sets of cards, each of
which defines an area and the computations for the target
to which the training area applies. The deck is completed
by an END card. Control cards within each set of area cards
are used to do the following:

1. specify whether unnormalized or normalized data
is to be used,

2. specify the channels to be used,

3. request the eigenvalues and eigenvectors
for either the variance-covariance matrix or
the correlation matrix,

4. specify output options,

5. specify the name of the category to which
the training area applies, and

6. specify the channels for which frequency
histograms are to be computed and output.

Output consists of a title page with the control 
specifications for the run and, for each training area, the 
name of the category followed by the mean and standard 
deviation vectors, the variance-covariance matrix, and the 
other requested statistical information. 


ACLASS PROGRAM DESCRIPTION


F. Y. Borden


ACLASS has as its purpose the classification and
mapping of multispectral scanner remote sensor data. Each
category is defined by a set of responses, one for each
channel (spectral band). These data, with the category name
and mapping symbol, form part of the input. Blocks of data
to be classified and mapped are also specified by input.

The data are classified according to their angle of
separation, in a multidimensional geometric sense, from each
of the categories, with the classification made into the
category for which the angle is smallest. Each data unit is
translated into the mapping symbol for the category to which
it was assigned. Map output is made on a scan line by scan
line basis for each specified block.

Additional output consists of auxiliary tables, one of 
which indicates the angles of separation between all pairs 
of categories specified. The program may be run for this 
information alone by deleting the map from computation. 

Computational Methods


The classification of each selected element is made
using its normalized vector and is based on its angle of
separation from each of the normalized category vectors. An
element in position j of scan line i will be represented as
the vector

    $X_{ij} = (x_{ij1}, x_{ij2}, \ldots, x_{ijp})'$

where the components of the vector are the responses in each
of p channels (spectral bands). The equivalent normalized
vector is $Z_{ij}$, which is computed as
$Z_{ij} = X_{ij} (X_{ij}' X_{ij})^{-1/2}$. In other words, each
component is

    $z_{ijk} = x_{ijk} \left( \sum_{l=1}^{p} x_{ijl}^2 \right)^{-1/2}$

For the normalized vectors, $Z'Z = 1$. Let $C_m$ be the
normalized response vector for class m, $m = 1, 2, \ldots, n$,
for the same p channels as in $X_{ij}$. The $C_m$ vectors are
input, having been estimated or established by means
external to the program. Actually, the unnormalized
$C_m$ vectors can be input, since the program will normalize
them. The reason for using normalized vectors instead of
unnormalized vectors is that the assumption does not have to
be made that each $C_m$ vector has the same magnitude of
response in each channel as the $X_{ij}$ which, in fact, belong
to category m. Furthermore, the assumption does not have to
be made that any two $X_{ij}$ which are, in fact, in a single
category have the same magnitudes of response in each
channel no matter on which side of the nadir each may occur
and no matter where along the flightline each may be
located. For example, for each $C_m$ the values may have been
estimated from data from an entirely different flightline or
from a laboratory spectral analysis. Using normalized
vectors eliminates the need for the $C_m$ vectors to be
estimated from data from the flightline under analysis with
its particular sun angle, general brightness, etc., which
are characteristics more or less unique to the flightline.

The angle of separation between the vectors $Z_{ij}$ and
$C_m$ is the criterion variable used for classification. The
constraints are discussed later, but overlooking the
constraints for the moment, $Z_{ij}$ will be classified as
belonging to category m if the angle between $Z_{ij}$ and $C_m$ is
smaller than for any other $C_l$, $l = 1, 2, \ldots, n$ and
$l \ne m$. Any pair of vectors in p dimensions defines three
points: the common origin and the end-point of each vector.
The three points can be considered as a plane triangle, as
shown in Figure 1. As a result of normalization, two sides
have unit length, and d is the distance between the
end-points.

Figure 1

For the vector pair $Z_{ij}$ and $C_m$, let $d_{ijm}$ be the distance
between the end-points. The $d_{ijm}$ are computable from

    $d_{ijm}^2 = (Z_{ij} - C_m)'(Z_{ij} - C_m) = \sum_{k=1}^{p} (z_{ijk} - c_{mk})^2$

From Figure 1, $\sin(\theta/2) = d/2$, from which $\theta$ can be
evaluated as $\theta = 2 \arcsin(d/2)$. Actually, in the program,
these steps beyond the computation of $d_{ijm}^2$ do not take place
inasmuch as only the smallest $d_{ijm}^2$ need be found to identify
the value of m for which the smallest $\theta$ exists. This
increases computational efficiency.

Constraints exist for the classification and these are
in the form of a maximum allowable angle of separation, $A_m$,
between $Z_{ij}$ and $C_m$. If $\theta_{ijm}$ is greater than $A_m$,
$Z_{ij}$ will not be classified as belonging to category m. The
angular values $A_m$, $m = 1, 2, \ldots, n$, for n classes are input
or set by default and are actually converted to critical
distances so that the $\theta$'s do not have to be computed, for
the reason given above. The classification criterion is applied
only to classes for which the above constraint is not
violated. If the constraints are not met for any of the
classes, then the observation is classified as "other."
Since the $A_m$, $m = 1, 2, \ldots, n$, do not have to have the same
value, it is possible that a $Z_{ij}$ could be classified as
belonging to class m for which the $d_{ijm}$, overlooking the
constraints, is not the smallest. This could occur in cases
where the $A_m$ was large compared to the $A_l$ for class l, for
which the $d_{ijl}$ was the smallest but not small enough to meet
the constraint imposed by $A_l$.
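
The angle-based rule with per-class limits can be sketched in Python as below. This is an illustration of the procedure described above, not the ACLASS source: the function names, the use of `None` for the "other" result, and the sample class vectors are hypothetical. Angular limits are converted once to critical squared distances with d = 2 sin(A/2), so no arcsine is evaluated per element.

```python
import numpy as np

def unit(v):
    """Normalize one vector, or each row of a matrix, to unit length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def classify(x, class_vectors, max_angles_deg):
    """Return the index of the assigned class, or None for "other".

    x: response vector of one element (p channels).
    class_vectors: (n, p) category responses, normalized here if needed.
    max_angles_deg: (n,) maximum allowable angle of separation per class.
    """
    z = unit(x)
    C = unit(class_vectors)
    d2 = np.sum((C - z) ** 2, axis=1)          # squared end-point distances
    half = np.radians(np.asarray(max_angles_deg, dtype=float)) / 2.0
    crit_d2 = (2.0 * np.sin(half)) ** 2        # angle limits as squared distances
    allowed = d2 <= crit_d2
    if not allowed.any():
        return None                            # no constraint met: "other"
    # Smallest distance among the classes whose constraint is satisfied.
    return int(np.argmin(np.where(allowed, d2, np.inf)))

# Two hypothetical 2-channel categories along the coordinate axes.
categories = [[1.0, 0.0], [0.0, 1.0]]
```

With limits of 5 and 60 degrees, an element lying 35 degrees from the first category and 55 degrees from the second is assigned to the second, which is the case described in the text where the assigned class is not the one with the smallest distance.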

To improve computational efficiency, since the program
is not limited entirely by tape processing time, the
following scheme was programmed. Let b be the smallest
distance of separation for all $d_{gh}$, $g = 2, \ldots, n$ and
$h = 1, 2, \ldots, (g-1)$, where $d_{gh}^2 = (C_g - C_h)'(C_g - C_h)$.
If, in the classification process for $Z_{ij}$, a class is
encountered for which $d_{ijm} < b/2$ and $d_{ijm}$ does not violate
the constraint for class m, $Z_{ij}$ can be assigned to class m
forthwith with no further investigation of other classes.
In addition, if the first class, m, to which $Z_{ij}$ is compared
is the one to which $Z_{i,j-1}$ was assigned, it is most
probable that $Z_{ij}$ will be assigned to class m. The result
would be that only one comparison would be made in the
majority of cases instead of n, the number of classes, for
each $Z_{ij}$ processed. The program was constructed in this
way because, in this kind of data, the probability that
neighboring observations are of the same target is very
high. This programming feature improved running time to
nearly tape speed.
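
The early-exit scheme can be sketched as follows. The shortcut is safe because, by the triangle inequality, an element closer than b/2 to one class vector must be farther than b/2 from every other. The Python below is an illustrative reading of the scheme, not the ACLASS code; the function name, the label -1 for "other", and the evaluation counter are assumptions added for demonstration.

```python
import numpy as np
from itertools import combinations

def classify_line(Z, C, crit_d):
    """Classify one scan line of normalized vectors with the b/2 shortcut.

    Z: (m, p) normalized element vectors in scan order.
    C: (n, p) normalized class vectors, n >= 2.
    crit_d: (n,) critical distances derived from the angular limits.
    Returns (labels, evaluations); label -1 stands for "other".
    """
    n = len(C)
    # b: smallest distance of separation between any two class vectors.
    b = min(np.linalg.norm(C[g] - C[h]) for g, h in combinations(range(n), 2))
    labels, evaluations, prev = [], 0, 0
    for z in Z:
        # Try the class of the preceding element first.
        order = [prev] + [m for m in range(n) if m != prev]
        best, best_d = -1, np.inf
        for m in order:
            d = np.linalg.norm(z - C[m])
            evaluations += 1
            if d <= crit_d[m]:
                if d < best_d:
                    best, best_d = m, d
                if d < b / 2.0:
                    break          # no other class can be nearer
        labels.append(best)
        if best >= 0:
            prev = best
    return labels, evaluations

def on_circle(angles_deg):
    """Unit vectors in the plane at the given angles (helper for the example)."""
    a = np.radians(np.asarray(angles_deg, dtype=float))
    return np.column_stack([np.cos(a), np.sin(a)])

classes = on_circle([0.0, 90.0, 45.0])                 # three hypothetical classes
limits = np.full(3, 2.0 * np.sin(np.radians(20.0) / 2.0))  # 20-degree limits
line = on_circle([1.0, 2.0, 3.0, 44.0, 46.0, 89.0])
labels, evaluations = classify_line(line, classes, limits)
```

In the example, runs of neighboring elements in the same class cost one distance evaluation each, so the total is well below the number of elements times the number of classes.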

Once an observation has been assigned to a category,
the map symbol is assigned to the position. After a scan
line is completed, the line of map symbols is output.


Use of the Program


ACLASS uses original or subset data tapes in the
Digital Aircraft Multispectral Scanner Data Tape Format.
Control cards are used to specify the flightline (tape file
name) that is to be processed, the channels to be used, and
the blocks to be classified and mapped according to scan
line and element designations. In addition, they are the
means of specifying the spectral characteristics of each of
the categories, the category mapping symbol for each
category, and the angular limit for each category within
which an unknown must fall to qualify for consideration as
belonging to the category. The output consists of a title
page with the general specifications, including the table of
angular separations for all pairs of categories. Map pages
are output for each block and a summary table is output of
the classification results by number and percentage in each
category. A control card allows the user to choose not to
have the map output, which saves substantially on run time
since the remote-sensor data do not have to be processed in
this case. This option is useful when the primary interest
for the run is in the angular separation of pairs of
categories and in the comparison of the spectral
characteristics of the categories.


DCLASS PROGRAM DESCRIPTION


F. Y. Borden


This program works exactly like ACLASS, except that it
does not normalize the signature vectors and therefore must
use the distance between any two vectors instead of the angle.


ACLUS PROGRAM DESCRIPTION


B. J. Turner


The purpose of ACLUS is the unsupervised classification
and digital mapping of multispectral remote sensor data. It
differs from ACLASS in that the user is not required to
specify a set of spectral signatures initially. ACLUS
develops its own set of spectral signatures using a
clustering algorithm and outputs a map on the basis of
these. The intensity of clustering and the intensity of
sampling of the data to form these clusters are under user
control.


Computational Methods

Remote sensor data is supplied to the program on
digital magnetic tape in standard format. The user
specifies by a control deck of cards or teletypewriter
records: (a) the corner coordinates of the block(s) to be
processed, (b) the number of sample points to be initially
chosen, and (c) the initial critical clustering angle, which
is discussed later.

The program uses the block specifications to randomly
select the required number of sample points, storing the
coordinates in an array. These are then sorted by scan
line number and element number within lines. If there are
multiple blocks, then the number of sample points is
allocated to each block in proportion to its size.

The clustering algorithm developed for ACLUS was
influenced by a method suggested by Tryon and Bailey¹ as
being useful when the number of observations is very large.
The first stage of this method, which they called "iterative
condensation on centroids," requires that trial group
centroids be set up and each point be assigned to that group
with which it has its smallest euclidean distance. After
all have been assigned, the centroid coordinates are
computed and the process iterated until no change in
allocation occurs.

In the ACLUS program the initial centroids are computed
from the first scan line in the specified block and from the
user-supplied initial critical angle, $\theta_c$. If the vector of
spectral data for the jth element within the ith scan line
is designated as $X_{ij}$ and its normalized analogue as $Z_{ij}$,
where $Z_{ij} = X_{ij} (X_{ij}' X_{ij})^{-1/2}$, then conceptually the
procedure is as follows. The angle, $\theta$, subtended at the
origin in p-dimensional space (assuming each observational
vector $Z_{ij}$ has p elements) by $Z_{11}$ and $Z_{12}$ is computed.
If this is less than $\theta_c$, then the mean vector, $C_1$, is
calculated and this becomes the first centroid. If
$\theta > \theta_c$, then $C_1 = Z_{11}$ and $C_2 = Z_{12}$. Then $Z_{13}$ is
attached to whichever centroid it makes the smaller angle
with, unless the angle is greater than $\theta_c$, in which case a
third centroid is formed.

¹R. C. Tryon and D. E. Bailey. 1972. Cluster Analysis.
McGraw-Hill, New York, N. Y. pp. 147-150.

The centroid is recomputed with each additional observation,
and a "moving" angular standard deviation is also computed.
This procedure is carried out for every element in the first
scan line. This defines the set of initial or trial
centroids on which the sample points are to be "condensed."
It can be seen that the number of initial centroids is
controlled by the initial critical angle: the larger the
angle, the smaller the number of initial centroids.
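
The formation of trial centroids from the first scan line can be sketched in Python as below. This is an illustrative reading of the procedure, not the ACLUS code. The function names are hypothetical, the centroid is assumed to be the mean of the normalized member vectors, and the "moving" angular standard deviation is omitted because the text does not give its formula.

```python
import numpy as np

def angle(u, v):
    """Angle in radians subtended at the origin by two vectors."""
    c = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.arccos(np.clip(c, -1.0, 1.0)))

def initial_centroids(scan_line, critical_angle_deg):
    """Form trial centroids from the elements of one scan line.

    scan_line: (m, p) response vectors in element order.
    Returns (centroids, counts), one entry per trial cluster.
    """
    theta_c = np.radians(critical_angle_deg)
    sums, counts = [], []
    for x in np.asarray(scan_line, dtype=float):
        z = x / np.linalg.norm(x)
        if sums:
            # A vector sum points the same way as the mean, so it can
            # stand in for the centroid when measuring angles.
            angles = [angle(z, s) for s in sums]
            k = int(np.argmin(angles))
        if not sums or angles[k] > theta_c:
            sums.append(z.copy())      # start a new centroid
            counts.append(1)
        else:
            sums[k] += z               # attach and update the running mean
            counts[k] += 1
    return [s / c for s, c in zip(sums, counts)], counts

# Hypothetical 2-channel scan line: three elements near 0 degrees, two near 50.
deg = np.radians([0.0, 2.0, 1.0, 50.0, 52.0])
first_line = 100.0 * np.column_stack([np.cos(deg), np.sin(deg)])
centroids_10, counts_10 = initial_centroids(first_line, 10.0)
centroids_90, counts_90 = initial_centroids(first_line, 90.0)
```

Running the example with a 10 degree and then a 90 degree critical angle gives two centroids and then one, in line with the remark that a larger angle yields fewer initial centroids.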

Each sample point is then located in turn on the data
tape and is attached to the nearest centroid unless it
deviates from this by an angle greater than some multiple of
the angular standard deviation, in which case it will form
a new centroid. If the point is accepted into an existing
cluster, the mean vector and angular standard deviation are
adjusted, and immediately adjacent points to the left and
right along the scan line are tested to see if they are
within the same cluster. If so, they are accepted and the
centroid statistics are recomputed; if not, the next sample
point is located. This technique makes use of the fact that
there is a high probability that immediately adjacent
observational points are spectral measurements of similar
objects because of the spatial pattern relationships that
exist in these data. The effect is to considerably augment
the sample size at little additional computational cost.

After all sample points and their neighbors have been
allocated, clusters that are represented by only one sample
point are dropped. The remaining clusters are then tested
to find if any overlap by one standard deviation. If so,
the overlapping clusters are fused into one. The clusters
are then sorted in descending order of their sample size.
Clusters that have been formed from only a few sample points
can be dropped (the user can specify the minimum
proportional sample size for a cluster), and if there are
still more than ten clusters remaining, the least
represented clusters are dropped until only ten remain.

If the user now wishes to obtain a digital map of the
same area, the tape is rewound and a character is assigned
to each cluster spectral signature. Each observational
element is assigned the character of the nearest cluster
unless it is outside any cluster by some user-supplied
multiple of the angular standard deviation, in which case it
is assigned a blank character. The matrix of characters so
formed is printed out as a digital map.

Use of the Program

ACLUS uses original or subset data tapes in the Digital
Aircraft Multispectral Scanner Data Tape Format. Control
cards are used to specify the flightline or scene (tape file
name) that is to be processed, the channels to be used, and
the blocks to be classified and mapped according to scan
line and element designations. In addition, they are used
to specify the number of sample points to be selected, the
initial critical angle for clustering, and the angular
standard deviation multiplier used in the element-by-element
classification for mapping. Options also exist for deleting
low-reliability clusters, for obtaining extended output,
for obtaining the map only, and for deleting the mapping
phase.

The following hints are offered for the benefit of new
users.

1. Run the program first with all default options.
To do this, input only the BLOCK card(s).

2. Compare the output map with the photographic
imagery. For more detailed mapping, reduce the
initial critical angle. For less detailed
mapping, increase the critical angle or delete
the low-reliability clusters.

3. To classify more of the mapped area
(i.e., reduce the unclassified blank area),
either (a) increase the initial critical angle
or (b) increase the standard deviation multiplier.
To obtain more blank unclassified area, either
(a) decrease the initial critical angle, (b)
decrease the standard deviation multiplier, or
(c) delete the low-reliability clusters.

4. Adjust the sampling intensity if the size of the
test block is very large or very small.

5. Vary only one factor at a time so that the effect
is not confounded.

6. Generally, the initial critical angle should be in the
range 1° to 10° and the standard deviation multiplier
between 2 and 5. The number of sample points cannot
exceed 900.


CANAL PROGRAM DESCRIPTION


H. Lachowski and F. Y. Borden


The CANAL program computes the canonical analysis for
categories of multispectral scanner data based on the mean
vectors and covariance matrices for the categories. The
categories are defined and their basic multivariate
statistics are obtained prior to the use of this program,
for example, by the use of training areas and the STATS
program. Each category is defined by a mean vector composed
of the set of averages, one for each spectral band, and the
corresponding covariance matrix. These basic statistics, in
addition to the category names and mapping symbols, form the
main part of the input to CANAL.

In the first part of the program, a canonical analysis
is performed on the data for all categories. The minimum
number of linear transformations yielding the maximum
separability among the categories is obtained as a result of
the canonical analysis. In the second part of the program,
each observation is first transformed using the linear
transformations and then classified according to its
euclidean distance of separation (in a multidimensional
geometric sense) from the transformed mean vector of each of
the categories. Classification is made into the category
for which the distance is smallest if the distance is within
a specified limit. If the distance exceeds the limit, the
observation remains unclassified. The observation is then
translated into the mapping symbol for the category to which
it was assigned. A map of the classification results is
output.

Additional output consists of auxiliary tables showing
various matrices computed in the canonical analysis and
distances of separation between all pairs of categories.
The program may be run for the statistical information alone
by optional termination of the program prior to the
classification and mapping computations.


Use of the Program

The canonical analysis program uses original or subset
tapes in the Digital Aircraft Multispectral Data Tape
Format. Control cards are used to: (1) input the
specifications for each category, i.e., the number of
observations, mean vector, covariance matrix, and the
category name; (2) specify the flightline or tape file name;
(3) specify the channels to be used and the blocks of data
to be classified and mapped; (4) set category mapping
symbols and limits; and (5) set various processing options.
Default options, which are appropriate in many
situations, minimize control card preparation. Output
consists of two parts: the canonical analysis and the
classification and mapping. Part one consists of a title
page with the control specifications for the run. The most
important output of this part is the transformation matrix
and the canonical axes that are used as the new signatures
for the given categories. The output also contains various
matrices computed in the canonical analysis. Part two
consists of the table of separations for all pairs of
categories and map pages for each block requested. For each
block, a summary table is output of the classification
results by number and percentage in each category. A
control card allows the user to terminate the program before
the classification and mapping is performed, thus saving
substantially on run time. This option is useful when the
primary interest for the run is the canonical analysis and
the table with separations of pairs of categories.


Computational Methods


Canonical Analysis

Consider several p-variate universes, say h in number.
Each universe may be conceived of as a swarm of points in
p-dimensional space centered at a point characterized by a
mean vector $\mu$ and dispersed about this point in an
ellipsoidal pattern characterized by the covariance matrix
$\Sigma$. The universes under consideration overlap to a greater
or lesser degree and the mean vectors are more or less
distinctly separated. A finite sample of observations can be
obtained from each of the h p-variate universes. Since
canonical analysis was explored for its potential use in the
analysis of multispectral scanner data, it will be presented
in this framework. Each sample of observations corresponds to
a training set for a given category (target). Each training
set, which is defined by the investigator, is chosen to be a
representative sample of data for a homogeneous target;
i.e., a target that has uniform characteristics differing
from point to point within the target area only by random
variability. Multispectral scanner measurements are taken
as the exemplification of the uniform characteristics.

Each observation will be represented as a p-component
vector,

    $X_{ij} = (x_{ij1}, x_{ij2}, \ldots, x_{ijp})'$

where p is the number of channels for element j in scan line
i. The sample mean vector for the kth category
($k = 1, 2, \ldots, h$) is

    $\bar{X}_k = \frac{1}{n_k} \sum_{\text{all } ij} \delta_{ijk} X_{ij}$

where $n_k = \sum_{\text{all } ij} \delta_{ijk}$, for $\delta_{ijk} = 1$ if
element j in scan line i belongs to the training area for
category k, and $\delta_{ijk} = 0$ if element j in scan line i does
not belong to the training area for category k.

The sample covariance matrix for the kth category is

    $S_k = \frac{1}{n_k - 1} \sum_{\text{all } ij} \delta_{ijk} (X_{ij} - \bar{X}_k)(X_{ij} - \bar{X}_k)'$

In addition to this, $\bar{X}$ is defined as a p x h matrix of
all the category mean vectors $\bar{X}_k$, $k = 1, 2, \ldots, h$, as

    $\bar{X} = [\, \bar{X}_1 \;\; \bar{X}_2 \;\; \cdots \;\; \bar{X}_h \,]$

with one column per category (cat. 1, cat. 2, ..., cat. h) and
one row per channel. The annotation of the above matrix
indicates the category and channel organization of the matrix.
Let N and n be, respectively, an h x h matrix and an h x 1
vector of the number of observations in the categories as

    $N = \mathrm{diag}(n_1, n_2, \ldots, n_h)$,  $n = (n_1, n_2, \ldots, n_h)'$

The method as presented here is based on the method by
Bartlett¹ and by Seal². It differs, however, from
their presentations after the initial steps and is directed
toward remote sensing data processing and analysis. The
introductory details will not be repeated here since readers
may refer to Seal (1964) for background information.

¹M. S. Bartlett. 1938. "Further aspects of the theory
of multiple regression." Proceedings of the Cambridge
Philosophical Society, 34.

²H. L. Seal. 1964. Multivariate Statistical Analysis
for Biologists. Methuen and Co., Ltd., London.

The objective of canonical analysis is to derive a
linear transformation that will emphasize the differences
among the sample estimates of the means of the h universes.
In other words, the objective is to define new coordinate
axes in directions of high information content useful for
classification purposes.

The desired transformation for the general X and Y is

    $Y = CX$

where C is the q x p transformation matrix with $q \le p$, and
Y is the transformed q-element observation vector. For
every $X_{ij}$ referenced by scan line i, element j,
$Y_{ij} = C X_{ij}$ with

    $Y_{ij} = (y_{ij1}, y_{ij2}, \ldots, y_{ijq})'$

The reason for $q \le p$ is explained later.

Let W be the combined covariance matrix for all the
categories, commonly referred to as the "within" category
covariance matrix and computed as

    $W = \left[ \sum_{i=1}^{h} (n_i - 1) \right]^{-1} \sum_{i=1}^{h} (n_i - 1) S_i$

where $S_i$ is the covariance matrix for category i, $n_i$ is the
number of observations for category i, and h is the number
of categories. Let P be the "among" categories covariance
matrix defined as

    $P = \bar{X} N \bar{X}' - \left( \sum_{i=1}^{h} n_i \right)^{-1} (\bar{X} n)(\bar{X} n)'$
In order to meet the objective of finding C so that the
differences among the groups are emphasized, $CPC'$ must be
maximized since $CPC'$ will be the "among" covariance matrix
for the transformed variables, Y. The matrix C can only be
made to be unique if additional constraints are placed on
it. The constraint that $CWC' = I$, for I the q x q identity
matrix, will suffice and, in addition, has the highly
desirable effect that under this constraint the transformed
variables will be independent and have unit variances.

Maximization of $CPC'$ subject to $CWC' = I$ cannot be
solved directly; therefore, it has to be cast into a
different form. This is accomplished in the following
manner. Defining $W^{1/2}$ so that $(W^{1/2})' W^{1/2} = W$ and using
$W^{1/2} W^{-1/2} = I$ and $(C (W^{1/2})')' = W^{1/2} C'$, then

    $CPC' = C (W^{1/2})' (W^{-1/2})' P \, W^{-1/2} W^{1/2} C'$

    $\phantom{CPC'} = (C (W^{1/2})') \, (W^{-1/2})' P \, W^{-1/2} \, (C (W^{1/2})')'$

In addition, $CWC' = I$ may be written as
$(C (W^{1/2})')(C (W^{1/2})')' = I$. Let $F = C (W^{1/2})'$; then
$FF' = I$. Let $V = (W^{-1/2})' P \, W^{-1/2}$; then, by
substitution of F and V, the problem is to maximize $FVF'$
subject to $FF' = I$. This form is now a straightforward
eigenvalue problem for which the only remaining difficulty
is in finding $W^{1/2}$.

In order to obtain $W^{1/2}$, find $\Lambda$ and A such that
$A \Lambda A' = W$ constrained by $AA' = I$, where $\Lambda$ is a
diagonal matrix of eigenvalues extracted from W, and A is a
matrix of corresponding eigenvectors. From this,
$W^{1/2} = \Lambda^{1/2} A'$. Furthermore, $W^{-1/2} = A \Lambda^{-1/2}$.
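
The construction of the two factors can be checked numerically with a short Python sketch. It assumes the reading $W^{1/2} = \Lambda^{1/2} A'$ and $W^{-1/2} = A \Lambda^{-1/2}$, which satisfies $(W^{1/2})' W^{1/2} = W$ and $W^{1/2} W^{-1/2} = I$; the function name and the example matrix are hypothetical, and NumPy's symmetric eigensolver stands in for whatever routine the program used.

```python
import numpy as np

def sqrt_factors(W):
    """Return (W_half, W_neg_half) from the eigen-decomposition W = A L A'."""
    lam, A = np.linalg.eigh(W)                    # columns of A are eigenvectors
    W_half = np.diag(np.sqrt(lam)) @ A.T          # Lambda^(1/2) A'
    W_neg_half = A @ np.diag(1.0 / np.sqrt(lam))  # A Lambda^(-1/2)
    return W_half, W_neg_half

# Hypothetical positive-definite "within" covariance matrix for 2 channels.
W_example = np.array([[4.0, 1.0],
                      [1.0, 3.0]])
W_half, W_neg_half = sqrt_factors(W_example)
```

The factorization requires W to be positive definite, so that every eigenvalue has a real, non-zero square root.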



Once $W^{-1/2}$ has been computed, the product
$T = (W^{-1/2})' P \, W^{-1/2}$ can be obtained. The next step is
to find Z and F such that

    $T = F Z F'$

and

    $F F' = I$

where Z is the diagonal matrix of the eigenvalues of T, and
F is a matrix of corresponding eigenvectors. These matrices
are as follows:

    $Z = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$,  $F = [\, f_1 \;\; f_2 \;\; \cdots \;\; f_p \,]$

The p eigenvalues of T are only distinguishable when
$p \le h - 1$. In the case when $p > h - 1$, there are
$p - (h - 1)$ zero eigenvalues (or computational approximations
of zero values) and $h - 1$ distinguishable non-zero
eigenvalues. A suitable procedure to test whether all the
eigenvalues after the qth can be ignored because they
are computational approximations of zero is
Bartlett's test.¹ Bartlett's test is based on the
fact that

\ In 

' (ifi ■ 7 ■ 'P " *'>^4 a=q+1 ^ 

is approximately a chi-sqnare variable with (p - q) times 
(h - q - 1) degrees of freedom when + j + s = . „ , = 0,. 

Here m is the smaller of h *- 1 and p. This is accom- 
plished by the testing of successive X’s for the given 
condition and stopping when the condition is satisfied. 

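Following this, Z is partitioned into q by q and p-q by
p-q submatrices. The q by q partition,

$$Z^* = \begin{bmatrix}
\lambda_1 & 0 & \cdots & 0 \\
0 & \lambda_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \lambda_q
\end{bmatrix}$$

contains the distinguishable eigenvalues and will be used
as a discriminant space. In a similar manner, F is
partitioned,

$$F = [\, F^* \;\; F_r \,]$$

where $F^*$ holds the q eigenvectors that correspond to $Z^*$
and $F_r$ (a label used here for the remaining p - q
eigenvectors) is set aside.

¹M. S. Bartlett. 1947. "Multivariate analysis."
Journal of the Royal Statistical Society, Supplement 9.
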

As a result of partitioning Z and using a reduced
discriminant space, only a certain portion of the total
variance will be retained. The percentage of variance
retained may be found from the following equation:

$$T = 100 \, \frac{\sum_{i=1}^{q} \lambda_i}{\sum_{i=1}^{p} \lambda_i}$$

The equation for FZF' (page 10) now becomes

$$F^* Z^* F^{*\prime} = W^{-1/2} P W^{-1/2}$$

and

$$F^* F^{*\prime} = I$$

The transformation matrix C may be computed from the
equation for F (page 10), which becomes $F^* = C^* W^{1/2}$.
Therefore, $C^*$, which is now a q x p matrix, is computed as

$$C^* = F^* W^{-1/2}$$

As mentioned before, the possible rank q of the
discriminant subspace depends on the relative sizes of p,
the number of elements in the X vector, and on h, the number
of categories. If h - 1 is less than p, then h - 1 is the
maximum possible rank of the discriminant space. For
example, if two classes are used, their centroids will have
to fit on a single line, the centroids of three classes will
have to fit on a plane, four classes in a three-dimensional
space, and so forth. If, however, h - 1 > p, it is possible
to have as many as p canonical axes. If the smaller of p
and h - 1 is quite large, one might decide to use fewer than
the maximum number of axes for reasons of parsimony. In
most cases, a reduced rank discriminant space yields
adequate results when employed in classification. The
ultimate aim is to reduce the problem of distinguishing
between multivariate populations to the scale of a single
variable.
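
The canonical transformation developed in this section can be
sketched numerically as follows. This is a minimal illustration in
Python with NumPy, not the ORSER code. The function name
`canonical_transform`, its argument names, and the small 3 x 3 matrices
are assumptions made for the example; W and P stand for the matrices of
the same names in the text, and the rows of `F_star` are taken to be
the retained eigenvectors.

```python
import numpy as np

def canonical_transform(W, P, q):
    """Return (C_star, eigenvalues, percent_retained).

    W must be symmetric positive definite and P symmetric.
    """
    # W = A diag(lam) A' with AA' = I, so W^(-1/2) = A diag(lam^(-1/2)) A'.
    lam, A = np.linalg.eigh(W)
    W_inv_half = A @ np.diag(lam ** -0.5) @ A.T

    # T = W^(-1/2) P W^(-1/2) is symmetric, so eigh applies.
    T = W_inv_half @ P @ W_inv_half
    z, vecs = np.linalg.eigh(T)

    # Arrange eigenvalues in descending order; eigenvectors become rows.
    order = np.argsort(z)[::-1]
    z = z[order]
    F = vecs[:, order].T

    # Keep the first q axes and map back: C* = F* W^(-1/2).
    F_star = F[:q]
    C_star = F_star @ W_inv_half

    # Percentage of variance retained by the q axes.
    percent = 100.0 * z[:q].sum() / z.sum()
    return C_star, z, percent

# Hypothetical matrices for three channels.
W = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])
P = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 0.5]])

C_star, z, percent = canonical_transform(W, P, q=2)
print(C_star @ W @ C_star.T)   # close to the 2 x 2 identity
print(percent)
```

With these conventions the constraint $C^* W C^{*\prime} = I$ holds by
construction, and $C^* P C^{*\prime}$ is the diagonal matrix of the q
largest eigenvalues.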


Classification Procedure

The classification method used here is based on
comparison of the euclidean distance between the input
observation (unknown to be classified) and the stored
references. Before the actual comparison takes place, the
input vector is centralized; i.e., the grand mean is
subtracted from it. It is then transformed according to
the canonical transformation described in the previous
section. The decision rule that was applied, based on the
comparison of euclidean distances, is as follows: Y belongs
to category i if $\|Y - \bar{Y}_i\| < \|Y - \bar{Y}_j\|$ and
$\|Y - \bar{Y}_i\| < C_i$; j = 1, ..., h; i ≠ j. In
this case, h categories are considered. Here $C_i$ is the
threshold value or limit for category i, $\bar{Y}_i$ is the sample
mean for category i, and $\bar{Y}_j$, j = 1, ..., h, are the sample
means for the remaining categories. This rule partitions
the space into h + 1 regions (h categories plus "other").

The unknown observation is classified as belonging to
category i if it is within the boundary limit defined by the
threshold for category i, which is $C_i$. If it is outside
all the h regions, the decision is made to classify the
observation in the "other" or unclassified category.

The procedure is different if the threshold value is 
not used. In this case, an unknown observation is
classified as belonging to the category for which the 
euclidean distance is smallest, without other limitations. 
Every observation, therefore, is classified as belonging to 
some category, but this does not necessarily mean that the 
decision is a correct one. The thresholds are used mainly 
to avoid classification in one of the h categories when the 
likelihood of success is marginal. 
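
The decision rule, with and without thresholds, can be sketched as
follows. This is an illustrative Python fragment, not the program
itself; the function name `classify` and the sample means and limits
are hypothetical. The observation and the category means are assumed to
be already centralized and canonically transformed.

```python
import math

def classify(y, means, thresholds=None):
    """Return the index of the assigned category, or None for "other".

    y          -- transformed observation (sequence of floats)
    means      -- one transformed sample mean per category
    thresholds -- optional limit C_i per category; None disables them
    """
    distances = [math.dist(y, m) for m in means]
    # Nearest category by euclidean distance.
    i = min(range(len(means)), key=lambda k: distances[k])
    # With thresholds, the nearest category must also be within its limit.
    if thresholds is not None and distances[i] >= thresholds[i]:
        return None
    return i

means = [(0.0, 0.0), (10.0, 0.0)]   # hypothetical category means
limits = [3.0, 3.0]                 # hypothetical thresholds

print(classify((1.0, 1.0), means, limits))   # 0
print(classify((5.5, 0.0), means, limits))   # None ("other")
print(classify((5.5, 0.0), means))           # 1 (no thresholds)
```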

RATIO PROGRAM DESCRIPTION


F. Y. Borden


The RATIO program is a classification and mapping
program for multispectral scanner remote sensor data based
on the ratio of two selected channels (spectral bands). The
program was designed primarily for vegetation analysis;
therefore, the description is presented in this frame of
reference. Using a general vegetation (or other) spectral
signature specified by the user, data for each remote
sensing unit that agree within a given tolerance to the
signature are selected for ratio determination. For each
remote sensing unit that is selected, the ratio of the two
selected channels is computed and the remote sensing unit is
assigned a mapping symbol corresponding to the class within
whose numerical boundaries the ratio value falls. For
example, consider the two vegetation classes, coniferous and
non-coniferous vegetation. It is well known that coniferous
vegetation in general has less reflectance in the reflected
infrared region than does non-coniferous vegetation. By
choosing the ratio denominator channel as one of those in
the chlorophyll region and the numerator channel as one of
those in the reflected infrared region, these two classes
can be separated on the basis of the ratio. The ratio
values for coniferous targets will be lower than those for
the non-coniferous targets. The separation bounds for the
targets must be specified by the user and can best be
determined experimentally by using a sample of data from the
scene to be analyzed.

In addition to map output, the frequency distribution 
table for ratio values is printed. This table is of 
particular value in choosing categories and in setting the 
bounds for the categories. 

Computational Methods

The data for each element in each scan line is
considered as a vector, say $X_{ij}$ for scan line i, element j.
For p channels, $X_{ij}$ is composed as

$$X_{ij} = (x_{ij1}, x_{ij2}, \ldots, x_{ijp})'$$

Let $Z_{ij}$ be the normalized analog of $X_{ij}$ so that
$Z_{ij} = X_{ij} (X_{ij}' X_{ij})^{-1/2}$ and $Z_{ij}' Z_{ij} = 1$.
Let V be a p-element vector, the values for which are
specified by the user to be the signature for, say,
vegetation, and let W be the normalized analog of V.

The user determines whether normalized or unnormalized data
will be used. This option is only effective in the
screening of the data prior to the ratio computation. If the
unnormalized option is selected, the $X_{ij}$ are screened by
selecting for the ratio computation only those $X_{ij}$ that have
$d_{ij}^2 = (X_{ij} - V)'(X_{ij} - V) < D^2$, where D is the critical
distance set by the user. The $d_{ij}$ is the euclidean distance
between the two points $X_{ij}$ and V in p-dimensional space.

For the normalized option, $Z_{ij}$ are selected for ratio
computation if $t_{ij}^2 = (Z_{ij} - W)'(Z_{ij} - W) < T^2$, where
$T = \tfrac{1}{2} \arcsin(\theta/2)$ and $\theta$ is the critical angle set by the user.
The $d_{ij}$ and $t_{ij}$ are related geometrically in that $t_{ij}$ is
directly related only to the angular separation of $X_{ij}$ and V,
whereas $d_{ij}$ is composed both of the angular separation and
the vector lengths of $X_{ij}$ and V. All data that are screened
out are assigned blanks as mapping symbols.
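
The screening step can be illustrated with a short Python sketch.
The function names and sample values are hypothetical. The limit is
passed in directly (D for the unnormalized option, T for the normalized
option), so no particular conversion from the critical angle is
assumed here.

```python
import math

def normalize(x):
    """Scale a vector to unit length (the normalized analog)."""
    length = math.sqrt(sum(v * v for v in x))
    return [v / length for v in x]

def squared_distance(a, b):
    """Squared euclidean distance between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def is_selected(x, signature, limit, normalized):
    """True if element x passes the screen against the signature.

    limit is D (unnormalized option) or T (normalized option).
    """
    if normalized:
        x, signature = normalize(x), normalize(signature)
    return squared_distance(x, signature) < limit ** 2

V = [30.0, 40.0]        # hypothetical two-channel signature
bright = [60.0, 80.0]   # same direction as V, twice the length

print(is_selected(bright, V, 10.0, normalized=False))  # False
print(is_selected(bright, V, 0.1, normalized=True))    # True
```

The example shows the geometric point made in the text: an element
that differs from the signature only in vector length fails the
distance screen but passes the angular screen.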

For the selected data, the ratio $R_{ij} = x_{ijk} / x_{ijl}$ is
computed using channels k and l as designated by the user.
The selection of the normalized or unnormalized data option
has no influence on the ratio since $z_{ijk} / z_{ijl} = x_{ijk} / x_{ijl}$.

Let $B_n$, n = 1, ..., m, be the upper bounds in ascending
order for the ratio in classifying the ratios into m
categories; $B_0 = 0$. Then if $R_{ij} > B_n$ for n = 1, ..., k-1 and
$R_{ij} \le B_k$, the mapping symbol for class k is assigned to the
element. If $R_{ij} > B_m$, the mapping symbol for the last class,
class m, is assigned. The user defines the bounds by input.
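
The assignment of a ratio to a class can be sketched as below. This
is an illustration only; the function name `ratio_class`, the use of
0-based channel indices, and the sample bounds are assumptions. A ratio
exactly equal to a bound is placed in the lower class, and a zero
denominator channel is not handled.

```python
def ratio_class(x, k, l, bounds):
    """Return the 1-based class number for element vector x.

    k, l   -- indices of the numerator and denominator channels
    bounds -- ascending upper bounds B_1 .. B_m
    """
    r = x[k] / x[l]
    for n, upper in enumerate(bounds, start=1):
        if r <= upper:
            return n
    # Ratios above the last bound fall in the last class, class m.
    return len(bounds)

bounds = [0.5, 1.0, 2.0]   # hypothetical bounds for m = 3 classes

print(ratio_class([40.0, 10.0], 1, 0, bounds))  # ratio 0.25 -> class 1
print(ratio_class([40.0, 30.0], 1, 0, bounds))  # ratio 0.75 -> class 2
print(ratio_class([20.0, 50.0], 1, 0, bounds))  # ratio 2.5  -> class 3
```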

The frequency distribution table of ratio values that is 
output is computed based on minimum and maximum ratio values 
specified by the user. For this table, one hundred equally 
spaced classes are set, with the minimum ratio as the lower 
bound of the first class and the maximum ratio as the upper 
bound for the last class. Values that are below the lowest 
bound or above the highest bound are assigned to the first 
or last class respectively. The frequency distribution 
output is valuable in setting the number of ratio categories
and their bounds.
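
A frequency table of this kind can be sketched in a few lines of
Python. The function name `frequency_table` and the sample ratios are
hypothetical; only the rule stated above (one hundred equal classes,
with out-of-range values counted in the first or last class) is taken
from the text.

```python
def frequency_table(ratios, r_min, r_max, n_classes=100):
    """Count ratio values in n_classes equally spaced classes."""
    width = (r_max - r_min) / n_classes
    counts = [0] * n_classes
    for r in ratios:
        index = int((r - r_min) // width)
        # Values below the lowest or above the highest bound go to
        # the first or last class respectively.
        index = max(0, min(n_classes - 1, index))
        counts[index] += 1
    return counts

ratios = [-1.0, 0.0, 0.005, 0.505, 1.0, 7.0]   # hypothetical values
counts = frequency_table(ratios, r_min=0.0, r_max=1.0)
print(counts[0], counts[50], counts[99])
```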

MERGE PROGRAM DESCRIPTION


D. N. Applegate and F. Y. Borden


The MERGE program is used to merge satellite data from
two ORSER formatted data tapes, each tape containing one or
more passes of the same area and each being from a different
date. The final merged tape may contain up to six different
passes. These merged data tapes are useful for studying
the effects of temporal change and, perhaps, for improving
classification of certain targets.


Tape Formats

Every tape generated by the MERGE program has the same 
tape format as the original data tapes. Every merged tape 
can be used by any of the programs that can operate on the 
original data tapes. The identification record of the first 
source tape is reproduced on the merged tape, except that 
the channels from the second source tape are added to the 
channels from the first tape. The table of contents for the
merged tape corresponds to the BLOCK data card described in
the control card section. All lines and elements on the
merged tape are numbered as they are on the first source
tape.


Use of the Program

The two source tapes may be merged tapes themselves,
but both must contain the area to be merged. The block to
be merged from the first source tape must be specified as
input to the program. Line and element differences from the
first source tape to the second tape must also be input;
these values are used to compute the block to be merged from
the second source tape. One way to calculate these
differences is to overlay digital maps from each tape; there
should be no rotational effect evident. Channels from the
source tapes are renamed to avoid duplicate channel numbers.
Program output consists of tape information pages for each
of the two source tapes and one for the merged tape.
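
The block arithmetic described above can be sketched as follows. The
function names, the block layout (first line, last line, first
element, last element), and the sign convention (difference = second
tape minus first tape) are assumptions for illustration; the text does
not state them.

```python
def second_tape_block(first_block, line_diff, element_diff):
    """Shift the first-tape block by the line and element differences."""
    first_line, last_line, first_element, last_element = first_block
    return (first_line + line_diff, last_line + line_diff,
            first_element + element_diff, last_element + element_diff)

def merge_element(first_channels, second_channels):
    """Append the second tape's channels to those of the first tape."""
    return list(first_channels) + list(second_channels)

block = second_tape_block((100, 199, 50, 149), line_diff=12, element_diff=-3)
merged = merge_element([31, 40, 55, 60], [29, 42, 58, 63])
print(block)    # (112, 211, 47, 146)
print(merged)   # eight channel values for one merged element
```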

MAPCOMP PROGRAM DESCRIPTION


D. N. Applegate and F. Y. Borden


The MAPCOMP program compares, element by element, two
digital classification maps of the same ground area. The
program was designed especially to compare maps generated
from merged satellite data. By using the MAPCOMP program,
one can compare classification results from two different
passes of the same area, each pass being from a different
date; or one can study the results of classification using
selected channels from different dates. Each map is
classified, line by line, according to the computational
methods described in the ACLASS (DCLASS) program
description. Each element from each map is assigned a
category symbol or a category number, depending on which
option was specified in the control cards. The elements are
then compared and a symbol is assigned designating which of
five cases applies: the elements were equal; the elements
were not equal; the first element equaled a blank or the
"other" category and the second did not; the second element
equaled a blank or the "other" category and the first did
not; or both elements equaled a blank or the "other"
category.
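
The five comparison cases and the summary of counts and percentages
can be sketched as below. The case labels, the function names, and the
use of "-" as the symbol for the "other" category are hypothetical
choices for the example, not the symbols used by MAPCOMP.

```python
from collections import Counter

# Symbols treated as unclassified: blank and a hypothetical "other" symbol.
UNCLASSIFIED = {" ", "-"}

CASES = ("equal", "not_equal", "first_unclassified",
         "second_unclassified", "both_unclassified")

def compare_elements(a, b):
    """Return which of the five cases applies to a pair of elements."""
    a_out = a in UNCLASSIFIED
    b_out = b in UNCLASSIFIED
    if a_out and b_out:
        return "both_unclassified"
    if a_out:
        return "first_unclassified"
    if b_out:
        return "second_unclassified"
    return "equal" if a == b else "not_equal"

def summarize(map1, map2):
    """Return {case: (count, percent)} over paired map elements."""
    counts = Counter(compare_elements(a, b) for a, b in zip(map1, map2))
    total = sum(counts.values())
    return {case: (counts[case], 100.0 * counts[case] / total)
            for case in CASES}

line1 = "AA B-"   # one scan line from the first map
line2 = "ABB -"   # the same scan line from the second map
summary = summarize(line1, line2)
print(summary)
```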


Use of the Program

Input to the program is any tape in the ORSER format,
but merged tapes are primarily used. Channels to be used
for each map, and the limiting distances or angles for each
map for each category, should be specified in the input to
the program, or the default values will be assigned. One
set of category cards applies to both maps. However, a
separate set of signature cards may be input for each map.
Brightness factors may be specified for each set of channels
by the use of the NORM cards described in the control card
section below.

Program output consists of a title page giving the
channels used in the classification of each map, and a list
of the category names, symbols (if used), limits, and
category signatures for each map. A comparison map and a
summary table listing the five cases noted in the
introduction above, with corresponding counts and
percentages, are output.

PRINCOM PROGRAM DESCRIPTION
J. R. Hoosty 

The purpose of the Principal Components Analysis (PRINCOM) program 
is to compute a transformation matrix from a set of observations within 
a chosen data site for use when performing principal components analysis. 
The rows of the transformation matrix correspond to the eigenvectors of 
the data site covariance matrix computed from eigenvalues arranged in 
descending order of magnitude. The data site mean vector and covariance 
matrix are acquired from the ORSER program STATS. The mean vector and 
transformation matrix are output into the BAT file* $PRNCOM for presenta- 
tion to the classification programs when using principal components anal- 
ysis as a preprocessing option. 
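
The computation can be sketched with NumPy as follows. This is an
illustration, not the PRINCOM code; the function name, the 2 x 2
covariance matrix, and the mean vector are hypothetical, and
subtracting the mean before applying the transformation is an
assumption about how the classification programs use the two outputs.

```python
import numpy as np

def principal_components(cov):
    """Return eigenvalues (descending), the transformation matrix with
    eigenvectors in rows, percent variance, and cumulative percent."""
    values, vectors = np.linalg.eigh(cov)   # eigenvectors in columns
    order = np.argsort(values)[::-1]        # descending eigenvalues
    values = values[order]
    transform = vectors[:, order].T         # eigenvectors in rows
    percent = 100.0 * values / values.sum()
    return values, transform, percent, np.cumsum(percent)

cov = np.array([[4.0, 2.0],
                [2.0, 3.0]])      # hypothetical data site covariance
mean = np.array([10.0, 20.0])     # hypothetical data site mean

values, transform, percent, cumulative = principal_components(cov)

# Transformation times its transpose should be the identity.
print(transform @ transform.T)

# Applying the transformation to one observation.
x = np.array([12.0, 19.0])
y = transform @ (x - mean)
print(y)
```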

The printed output consists of the following: 

1. Echo-check of the mean vector and covariance matrix for 
the data site. 

2. Eigenvalues of the covariance matrix. 

3. Matrix with eigenvectors in columns. 

4. Transformation matrix with eigenvectors in rows. 

5. Resultant matrix computed by multiplying the transformation 
matrix by its transpose.

6. Percent of total variance represented by each vector in 
the transformation matrix and the cumulative percent variance. 

Output on BAT file $PRNCOM consists of the following:

1. Data site mean vector.

2. Transformation matrix.


*The batch and terminal (BAT) file receives output from remote job
entry terminals. This data can later be retrieved as output or used as 
input to another program. 

HINDIS PROGRAM DESCRIPTION
J. R. Hoosty 

The purpose of the Parametric Classifier with Linear Discriminant 
Function (HINDIS) program is to implement a minimum distance classifier 
based on pattern class means. The linear discriminant function used has 
the following form:

$$g_i(x) = x' m_i - \tfrac{1}{2} m_i' m_i$$

A mean vector, $m_i$, for each class is first computed from a set of
training patterns and then a selected data site is classified using
these computed means and the discriminant function shown above.
Classification is completed by choosing the class which corresponds
to the largest discriminant function.
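
A minimum distance classifier of this kind can be sketched in Python
as below. The sketch assumes the usual linear form
g_i(x) = x'm_i - (1/2) m_i'm_i, for which the largest discriminant
corresponds to the nearest class mean. The function names, class
names, and training patterns are hypothetical.

```python
def class_means(training):
    """Compute the mean vector of each class from its training patterns."""
    means = {}
    for name, patterns in training.items():
        n = len(patterns)
        means[name] = [sum(column) / n for column in zip(*patterns)]
    return means

def discriminant(x, m):
    """Linear discriminant x'm - (1/2) m'm for one class mean m."""
    dot_xm = sum(a * b for a, b in zip(x, m))
    dot_mm = sum(b * b for b in m)
    return dot_xm - 0.5 * dot_mm

def classify(x, means):
    """Choose the class with the largest discriminant function."""
    return max(means, key=lambda name: discriminant(x, means[name]))

training = {
    "water": [[1.0, 2.0], [3.0, 2.0]],
    "forest": [[8.0, 9.0], [10.0, 7.0]],
}
means = class_means(training)
print(classify([3.0, 3.0], means))   # water
print(classify([8.0, 7.0], means))   # forest
```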

The printed output consists of the following: 

1. Echo-check of input control cards. 

2. Echo-check of mean vector and transformation matrix, if 
principal components option is selected. 

3. Patterns from each training block (optional). 

4. Number of patterns in each class. 

5. Sample training patterns from each class. 

PARAM PROGRAM DESCRIPTION
J. R. Hoosty 

The purpose of the Parametric Classifier with Quadratic Discriminant 
Function (PARAM) program is to implement a classifier based on the sta- 
tistical parameters of a training set of patterns from selected classes. 
The covariance matrix and mean vector for each pattern set are acquired
from the ORSER program STATS.

When a pattern is input, a discriminant function is computed using
the input parameters in the following form:

$$g_i(x) = \ln p(i) - \tfrac{1}{2} \ln |\Sigma_i| - \tfrac{1}{2} \left[ (x - m_i)' \Sigma_i^{-1} (x - m_i) \right]$$

where $\Sigma_i$ and $m_i$ are the covariance matrix and mean vector for
the i'th class, and p(i) is the probability that a random pattern, x,
belongs to the i'th class. If the probabilities are unknown or otherwise
omitted from the input, the program assumes the probability of each class
to be equal.

A training set, i.e., blocks of patterns known to belong to certain 
classes, is input along with a selected data site. PARAM first classifies 
the training set and outputs the percent correct classification of each 
test block, to give an indication of classifier performance. The program 
then classifies and outputs a map of the data site, followed by a summary 
for each class. The PARAM program uses the theoretically optimal dis- 
criminant function for normally distributed patterns. 
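
The quadratic discriminant for normally distributed patterns can be
sketched with NumPy as follows. The function names and the two-class
statistics are hypothetical; in PARAM the means and covariance matrices
would come from STATS. Equal class probabilities are used when none are
supplied, as described above.

```python
import numpy as np

def quadratic_discriminant(x, mean, cov, prior):
    """ln p(i) - (1/2) ln|cov| - (1/2) (x - m)' cov^-1 (x - m)."""
    diff = x - mean
    _, log_det = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return np.log(prior) - 0.5 * log_det - 0.5 * quad

def classify(x, means, covs, priors=None):
    """Return the index of the class with the largest discriminant."""
    h = len(means)
    if priors is None:
        priors = [1.0 / h] * h   # equal probabilities by default
    scores = [quadratic_discriminant(x, m, c, p)
              for m, c, p in zip(means, covs, priors)]
    return int(np.argmax(scores))

# Hypothetical statistics for two classes in two channels.
means = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
covs = [np.eye(2), np.array([[2.0, 0.5], [0.5, 1.0]])]

print(classify(np.array([0.5, -0.2]), means, covs))   # 0
print(classify(np.array([3.5, 4.2]), means, covs))    # 1
```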

The printed output consists of the following: 

1. Echo-check of input control cards. 

2. Echo-check of mean vector and covariance matrices 
for each class. 

3. Echo-check of mean vector and transformation matrix,
if the principal components option is selected.

4. Transformed mean and covariance matrix for each class,
if the principal components option is selected.

5. Inverse of the covariance matrix, and the inverse times
the covariance matrix, for each class.

6. Patterns from each test block selected. 

7. Percent correct classification of the test set. 

8. Heading with flight line information. 

9. Map of data site. 

10. Block summary. 

NPAR PROGRAM DESCRIPTION
J. R. Hoosty 


The purpose of the Nonparametric Trainer with Linear Discriminant
Function (NPAR) program is to find a set of weights for use in a linear
classifier with a discriminant function of the form:

$$g(x) = w_1 x_1 + w_2 x_2 + \cdots + w_d x_d + w_{d+1}$$


Training sets of patterns belonging to selected classes are input and 
nonparametric training is accomplished using the fixed increment rule 
as the error correction procedure. Weights from the final training run 
are output to the BAT file* $WTS for input to the NPARMAP program. 
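
Training with a fixed increment rule can be sketched as below. This
is a generic multi-class error correction procedure written for
illustration, not the NPAR code; the function names, the zero starting
weights, the increment of 1, the stopping rule (a run with no errors or
a run limit), and the training patterns are all assumptions.

```python
def discriminants(weights, x):
    """Evaluate g(x) = w_1 x_1 + ... + w_d x_d + w_(d+1) for each class."""
    y = list(x) + [1.0]   # augmented pattern
    return [sum(w * v for w, v in zip(row, y)) for row in weights]

def classify(weights, x):
    """Return the class with the largest discriminant function."""
    g = discriminants(weights, x)
    return max(range(len(g)), key=lambda c: g[c])

def train(patterns, labels, n_classes, increment=1.0, max_runs=100):
    """Fixed increment training; returns (weights, number_of_runs)."""
    d = len(patterns[0])
    weights = [[0.0] * (d + 1) for _ in range(n_classes)]
    for run in range(1, max_runs + 1):
        errors = 0
        for x, label in zip(patterns, labels):
            g = discriminants(weights, x)
            # Strongest competing class; a tie counts as an error.
            rival = max((c for c in range(n_classes) if c != label),
                        key=lambda c: g[c])
            if g[rival] >= g[label]:
                y = list(x) + [1.0]
                for k in range(d + 1):
                    weights[label][k] += increment * y[k]
                    weights[rival][k] -= increment * y[k]
                errors += 1
        if errors == 0:   # every training pattern classified correctly
            break
    return weights, run

patterns = [[1.0, 1.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
labels = [0, 0, 1, 1]
weights, runs = train(patterns, labels, n_classes=2)
print(runs, weights)
```

For linearly separable training sets such as this one, the procedure
stops after a finite number of runs; for non-separable sets only the
run limit ends it.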

The printed output consists of the following: 

1. Echo-check of input control cards. 

2. Echo-check of mean vector and transformation matrix, 
if the principal components option is selected. 

3. Heading with flight line information. 

4. Patterns from each training block selected. 

5. Training run number, percent correct classification of 
the training set, and values of the weights after each training run. 

6. Final classification of the training set after stopping 
rule was implemented. 

Output to BAT file $WTS consists of the following: 

1. Number of classes. 

2. Final weights matrix (classes, channels).


*The batch and terminal (BAT) file receives output from remote job
entry terminals. This data can later be retrieved as output or used as 
input to another program. 

NPARMAP PROGRAM DESCRIPTION
J. R. Hoosty


The purpose of the Nonparametric Classifier and Mapper with Linear 
Discriminant Function (NPARMAP) program is to classify and map a selected 
data site using a linear discriminant function of the form: 


$$g(x) = w_1 x_1 + w_2 x_2 + \cdots + w_d x_d + w_{d+1}$$


The number of classes and weights for each class are input from the 
NPAR program through the BAT file* $WTS. A pattern is classified into
the class which yields the largest discriminant function resulting after 
the weight vector of each class is multiplied by the pattern. 

The printed output consists of the following: 

1. Echo-check of input control cards. 

2. Echo-check of number of classes and weights matrix. 

3. Echo-check of mean vector and transformation matrix, 
if principal components option is selected. 

4. Heading with flight line information. 

5. Map of data site. 

6. Block summary. 


*The batch and terminal (BAT) file receives output from remote job
entry terminals. This data can later be retrieved as output or used as 
input to another program. 

QUADNPAR PROGRAM DESCRIPTION
J. R. Hoosty 


The purpose of the Nonparametric Trainer with Quadratic Discriminant 
Function (QUADNPAR) program is to find a set of weights for use in a 
classifier employing a quadratic discriminant function of the form: 




$$g(x) = \sum_{j=1}^{d} w_{jj} x_j^2 + \sum_{j=1}^{d-1} \sum_{k=j+1}^{d} w_{jk} x_j x_k + \sum_{j=1}^{d} w_j x_j + w_{d+1}$$


Training sets belonging to selected classes are input and the patterns 
are passed through a quadratic processor. Nonparametric training is 
conducted, using the fixed increment rule as the error correction tech- 
nique. After each training run the training set is classified and the 
percent correct classification is output. Weights from the final train- 
ing run are output to the BAT file* $QWTS for input to the QUADMAP program. 
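
The quadratic processor can be pictured as a fixed expansion of each
pattern into squares, cross products, linear terms, and a constant, so
that the quadratic discriminant becomes a linear function of the
expanded vector. The Python sketch below is illustrative; the function
names and the ordering of the terms are assumptions, since the actual
ordering is given by the indices written to the BAT file.

```python
from itertools import combinations

def quadratic_processor(x):
    """Expand x into [squares, cross products (j < k), linear terms, 1]."""
    squares = [v * v for v in x]
    cross = [x[j] * x[k] for j, k in combinations(range(len(x)), 2)]
    return squares + cross + list(x) + [1.0]

def discriminant(weights, x):
    """Quadratic discriminant as a dot product with the expanded pattern."""
    return sum(w * v for w, v in zip(weights, quadratic_processor(x)))

expanded = quadratic_processor([2.0, 3.0])
print(expanded)   # [4.0, 9.0, 6.0, 2.0, 3.0, 1.0]

# Hypothetical weights for one class, d = 2: six terms in all.
print(discriminant([1.0, 1.0, 0.0, 0.0, 0.0, -10.0], [2.0, 3.0]))   # 3.0
```

For d channels the expanded pattern has (d + 1)(d + 2)/2 terms, which
is the number of weights needed per class.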

The printed output consists of the following:

1. Echo-check of input control cards.

2. Echo-check of mean vector and transformation matrix,
if principal components option is selected.

3. Heading with flight line information.

4. Patterns from each training block.

5. Sample of quadratic patterns.

6. Training run number and percent correct classification
of the training set.

7. Final classification of the training set after the
last training run.

8. Final weights and the average percent classification.

Output to BAT file $QWTS consists of the following:

1. Number of classes.

2. Indices for quadratic processor.

3. Final weights matrix (classes, indices).


*The batch and terminal (BAT) file receives output from remote job
entry terminals. This data can later be retrieved as output or used as
input to another program.

QUADMAP PROGRAM DESCRIPTION
J. R. Hoosty


The purpose of the Nonparametric Classifier and Mapper with 
Quadratic Discriminant Function (QUADMAP) program is to classify and 
map a selected data site using weights from the program QUADNPAR and a 
quadratic discriminant function of the form: 


$$g(x) = \sum_{j=1}^{d} w_{jj} x_j^2 + \sum_{j=1}^{d-1} \sum_{k=j+1}^{d} w_{jk} x_j x_k + \sum_{j=1}^{d} w_j x_j + w_{d+1}$$


Each pattern in the data site is passed through a quadratic processor,
and the resultant vector is multiplied by weight vectors from each class,
yielding a discriminant function for each class. The number of classes,
indices for the quadratic processor, and class weights are input from the
QUADNPAR program through the BAT file* $QWTS. Classification is completed
by choosing the largest of the discriminant functions.

The printed output consists of the following: 

1. Echo-check of input control cards. 

2. Echo-check of the number of classes, indices, and 
weights for each class. 

3. Echo-check of mean vector and transformation matrix,
if principal components option is selected. 

4. Heading with flight line information. 

5. Map of data site. 

6. Block summary. 


*The batch and terminal (BAT) file receives output from remote job
entry terminals. This data can later be retrieved as output or used as 
input to another program. 



